Data Science Course - Spring 1403

Project Phase1 - Introduction to Data Science

Mohammadreza Mohammadhashemi : 810100206

Soheil Hajian Manesh : 810100119

Mahdi Ebrahimi Soltani : 810100241

Problem Description

In this phase of the project, we scrape data from Fiverr.com using BeautifulSoup and Selenium. The goal is to extract meaningful statistics from the dataset through exploratory data analysis (EDA), visualization, and preprocessing. Fiverr.com is an online marketplace that connects freelancers with clients who need various services. Launched in 2010, Fiverr offers a wide range of digital services, known as "gigs", in categories such as graphic design, writing, translation, video editing, programming, and digital marketing. We collected about 6,000 records from 10 different categories.

In [ ]:
import pandas as pd
import numpy as np
from ydata_profiling import ProfileReport
from ydata_profiling.config import Settings
from ydata_profiling.report.presentation.flavours.html import templates
import matplotlib.pyplot as plt
import seaborn as sns
import gc
import scipy.stats as stats
In [ ]:
df = pd.read_csv("./FiverrFinal.csv")
In [ ]:
df.drop_duplicates(inplace=True)
In [ ]:
def print_unique_elements(df, columns):
    """Print each distinct value in the given columns with its count, most frequent first."""
    for column in columns:
        try:
            unique_counts = df[column].value_counts()
        except KeyError:
            print(f"    Column '{column}' not found in the DataFrame.")
            continue
        # ANSI escape codes: blue for the column name, red for each value.
        print(f"\033[94m{column}\033[0m:")
        for value, count in unique_counts.items():
            print(f"        \033[91m{value}\033[0m {count}")
In [ ]:
print_unique_elements(df, ['Basic Price'])
Basic Price:
        $10 678
        $5 490
        $100 412
        $30 363
        $15 327
        $50 312
        $20 301
        $25 237
        $150 189
        $40 165
        $35 146
        $80 116
        $200 102
        $45 96
        $90 95
        $60 95
        $75 73
        $250 63
        $120 61
        €9.67 58
        $300 57
        $70 56
        $125 56
        $65 45
        €4.84 44
        $500 40
        $95 36
        $55 36
        $140 35
        €14.51 33
        €19.34 29
        $85 28
        €48.36 27
        €96.72 26
        $350 26
        $130 25
        $110 24
        $105 22
        $160 19
        €145.07 19
        $180 18
        $135 18
        €29.01 18
        $400 17
        €24.18 17
        $195 17
        $145 15
        $175 14
        $1,000 14
        $600 13
        $115 12
        $170 12
        €87.04 11
        $225 10
        €43.52 10
        $295 10
        $190 9
        $2,000 9
        $450 9
        ₪115.23 8
        $220 8
        €38.69 8
        €193.43 7
        €58.03 7
        $245 7
        €241.79 7
        US$10 7
        ₪19.21 7
        £82.38 7
        $50Save up to 10% with Subscribe to Save 6
        $275 6
        $375 6
        US$30 6
        €290.15 6
        $800 6
        £24.71 6
        US$15 6
        £4.12 6
        US$50 6
        €77.41 5
        ₪192.06 5
        $210 5
        $185 5
        $155 5
        US$5 5
        $650 5
        $1,800 5
        $1,200 5
        $290 5
        $1,500 5
        $3,000 5
        $995 5
        £41.19 5
        $240 5
        ₪38.41 5
        $20Save up to 10% with Subscribe to Save 4
        €53.19 4
        $270 4
        $100Save up to 10% with Subscribe to Save 4
        $900 4
        $30Save up to 10% with Subscribe to Save 4
        £123.57 4
        $550 4
        $10Save up to 10% with Subscribe to Save 4
        €72.54 4
        $4,000 4
        $395 4
        €33.85 4
        €82.21 4
        $475 4
        $950 4
        $165 4
        €174.09 4
        $325 4
        $390 4
        US$100 4
        €106.39 3
        ₪57.62 3
        €116.11 3
        $2,500 3
        €96.76 3
        $495 3
        €154.74 3
        ₪384.11 3
        $750 3
        $335 3
        $590 3
        ₪307.29 3
        ₪96.03 3
        £28.83 3
        €67.70 3
        €483.58 3
        US$20 3
        €125.73 3
        $1,350 3
        US$55 3
        ₪576.17 3
        ₪768.22 3
        $255 3
        €116.06 3
        $45Save up to 10% with Subscribe to Save 2
        £74.14 2
        €82.25 2
        $250Save up to 15% with Subscribe to Save 2
        ₪960.28 2
        £8.24 2
        $340 2
        $25Save up to 20% with Subscribe to Save 2
        $40Save up to 5% with Subscribe to Save 2
        $15Save up to 10% with Subscribe to Save 2
        €435.22 2
        £45.31 2
        ₪134.44 2
        $3,400 2
        $380 2
        $445 2
        $25Save up to 10% with Subscribe to Save 2
        US$150 2
        US$70 2
        US$35 2
        $200Save up to 10% with Subscribe to Save 2
        $230 2
        ₪76.82 2
        €338.50 2
        $1,100 2
        $1,495 2
        $60Save up to 10% with Subscribe to Save 2
        €77.37 2
        $1,400 2
        $1,250 2
        $10Save up to 15% with Subscribe to Save 2
        €91.88 2
        €193.52 2
        $700 2
        $125Save up to 15% with Subscribe to Save 2
        $480 2
        $125Save up to 10% with Subscribe to Save 2
        US$95 2
        €19.35 2
        ₪153.64 2
        $30Save up to 5% with Subscribe to Save 2
        $320 2
        $285 2
        $20,000 2
        $490 2
        US$155 1
        ₪2,843 1
        $1,875 1
        $555 1
        $740 1
        €377.19 1
        US$500 1
        €149.91 1
        ₪633.78 1
        €285.31 1
        $10,000 1
        $405 1
        €391.70 1
        US$390 1
        $280 1
        $560 1
        $440 1
        $810 1
        $7,500 1
        €411.04 1
        €531.93 1
        US$1,200 1
        US$200 1
        $2,700 1
        $95Save up to 5% with Subscribe to Save 1
        €822.08 1
        $4,500 1
        €406.20 1
        $505 1
        $25Save up to 15% with Subscribe to Save 1
        $540 1
        US$120 1
        $1,365 1
        $455 1
        €101.55 1
        $795 1
        $75Save up to 10% with Subscribe to Save 1
        $460 1
        €169.25 1
        $2,695 1
        $265 1
        $1,600 1
        US$10,000 1
        US$25 1
        $370 1
        $250Save up to 20% with Subscribe to Save 1
        $3,750 1
        $890 1
        €87.08 1
        £12.36 1
        $1,005 1
        $1,225 1
        €962.76 1
        €382.20 1
        €1,161 1
        €967.60 1
        US$35Save up to 10% with Subscribe to Save 1
        $430 1
        $360 1
        £2,472 1
        $660 1
        €488.64 1
        €125.79 1
        $315 1
        $1,295 1
        $6,500 1
        $2,900 1
        $3,600 1
        €130.57 1
        $610 1
        $50Save up to 5% with Subscribe to Save 1
        $745 1
        £20.60 1
        $15Save up to 20% with Subscribe to Save 1
        €29.01Save up to 10% with Subscribe to Save 1
        US$190 1
        $60Save up to 15% with Subscribe to Save 1
        €48.36Save up to 15% with Subscribe to Save 1
        €164.42Save up to 10% with Subscribe to Save 1
        US$65 1
        $170Save up to 10% with Subscribe to Save 1
        ₪518.55 1
        €9.67Save up to 20% with Subscribe to Save 1
        US$45 1
        US$350 1
        ₪172.85 1
        €507.76 1
        €48.36Save up to 5% with Subscribe to Save 1
        $670 1
        $200Save up to 20% with Subscribe to Save 1
        €241.90 1
        $150Save up to 10% with Subscribe to Save 1
        €628.65Save up to 10% with Subscribe to Save 1
        $690 1
        $100Save up to 15% with Subscribe to Save 1
        $15Save up to 5% with Subscribe to Save 1
        €774.08 1
        $980 1
        £152.41 1
        $20Save up to 20% with Subscribe to Save 1
        $525 1
        €435.42 1
        $50Save up to 20% with Subscribe to Save 1
        $20Save up to 15% with Subscribe to Save 1
        $80Save up to 20% with Subscribe to Save 1
        €33.87 1
        €9.68 1
        £107.10Save up to 10% with Subscribe to Save 1
        $30Save up to 20% with Subscribe to Save 1
        £1,607Save up to 20% with Subscribe to Save 1
        ₪364.91 1
        $425 1
        ₪2,497 1
        £131.81 1
        €967.15 1
        €333.67Save up to 15% with Subscribe to Save 1
        €478.74 1
        ₪921.87 1
        $11,400 1
        $3,200 1
        €203.10 1
        $3,500 1
        $1,300 1
        €3,676 1
        $20Save up to 5% with Subscribe to Save 1
        ₪749.02 1
        $1,755 1
        $920 1
        €449.73 1
        $10Save up to 5% with Subscribe to Save 1
        €120.89 1
        $35Save up to 20% with Subscribe to Save 1
        €962.32 1
        £61.79 1
        $15Save up to 15% with Subscribe to Save 1
        $100Save up to 20% with Subscribe to Save 1
        US$300 1
        $60Save up to 20% with Subscribe to Save 1
        $120Save up to 20% with Subscribe to Save 1
        $345 1
        ₪249.67 1
        £65.90 1
        US$110 1
        $235 1
        US$290 1
        US$295 1
        $3,800 1
        $990 1
In [ ]:
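# Literal (non-regex) replacements keep '+' from being read as a regex quantifier; the raw string keeps \d intact.
df['Seller In Same Level'] = df['Seller In Same Level'].str.replace(',', '', regex=False).str.replace('+', '', regex=False).str.extract(r'(\d+)', expand=False).astype(int)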
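The cleaning step above strips thousands separators and a plus sign, then keeps the first run of digits as an integer. A minimal sketch of the same idea on plain strings, using only the standard library. The sample inputs (such as `"1,000+"`) and the helper name `parse_count` are hypothetical, since the raw column values are not shown in this notebook.

```python
import re

def parse_count(raw: str) -> int:
    """Return the first integer found in a string such as '1,000+'."""
    # Remove thousands separators and the plus sign as literal characters.
    cleaned = raw.replace(",", "").replace("+", "")
    match = re.search(r"\d+", cleaned)
    if match is None:
        raise ValueError(f"no digits found in {raw!r}")
    return int(match.group())

# Hypothetical raw values, for illustration only.
samples = ["1,000+", "313", "2,800+"]
parsed = [parse_count(s) for s in samples]
print(parsed)
```

Raising on a value with no digits makes malformed rows visible early, whereas the pandas chain would produce a missing value and then fail at `astype(int)`.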
In [ ]:
print_unique_elements(df, ['Seller In Same Level'])
Seller In Same Level:
        1000 226
        2800 183
        2000 133
        313 130
        8900 96
        2200 95
        1800 94
        2700 94
        12000 93
        606 92
        1100 91
        2400 91
        533 89
        1200 89
        23 68
        683 48
        6100 48
        44000 48
        4300 48
        298 48
        26000 48
        5900 48
        3800 48
        53000 48
        33000 48
        1773 48
        6000 48
        1700 48
        7281 48
        16000 48
        2100 48
        2900 48
        50472 48
        361 48
        2248 48
        5500 48
        15000 48
        641 47
        620 47
        7400 47
        14000 47
        1600 47
        1500 47
        10000 47
        120000 47
        400 47
        4600 47
        2600 46
        959 46
        17000 46
        6176 46
        895 46
        6393 46
        524 46
        662 46
        23000 46
        449 46
        553 46
        41000 46
        6400 46
        75000 46
        2500 46
        258 45
        988 45
        8100 45
        19000 45
        763 45
        5800 45
        7800 45
        960 44
        352 44
        177 44
        502 44
        155 44
        170 44
        198 44
        496 44
        550 44
        608 43
        107 43
        215 43
        401 43
        328 43
        406 43
        173 43
        329 43
        459 43
        151 42
        315 42
        128 42
        7900 42
        100 42
        5400 42
        126 42
        181 41
        76 41
        292 41
        293 40
        132 39
        92 38
        35 38
        172 38
        38 38
        461 37
        55 35
        34 34
        65 34
        32 32
        27 27
        25 26
        22 24
        18 20
        9 9
        1730 8
        7 7
        2 2
        1 1
In [ ]:
df['Basic Price'] = df['Basic Price'].str.split("Save up to").str[0]
print_unique_elements(df, ['Basic Price'])
Basic Price:
        $10 685
        $5 490
        $100 418
        $30 370
        $15 332
        $50 320
        $20 308
        $25 242
        $150 190
        $40 167
        $35 147
        $80 117
        $200 105
        $60 99
        $45 98
        $90 95
        $75 74
        $250 66
        $120 62
        $125 60
        €9.67 59
        $300 57
        $70 56
        $65 45
        €4.84 44
        $500 40
        $95 37
        $55 36
        $140 35
        €14.51 33
        €48.36 29
        €19.34 29
        $85 28
        €96.72 26
        $350 26
        $130 25
        $110 24
        $105 22
        €145.07 19
        $160 19
        €29.01 19
        $180 18
        $135 18
        $400 17
        $195 17
        €24.18 17
        $145 15
        $175 14
        $1,000 14
        $600 13
        $170 13
        $115 12
        €87.04 11
        $225 10
        €43.52 10
        $295 10
        $450 9
        $190 9
        $2,000 9
        $220 8
        ₪115.23 8
        €38.69 8
        ₪19.21 7
        €58.03 7
        $245 7
        €241.79 7
        €193.43 7
        £82.38 7
        US$10 7
        $375 6
        $800 6
        £24.71 6
        €290.15 6
        US$50 6
        US$30 6
        £4.12 6
        $275 6
        US$15 6
        ₪192.06 5
        $3,000 5
        ₪38.41 5
        $290 5
        $995 5
        $240 5
        US$5 5
        $210 5
        $155 5
        $1,200 5
        $1,800 5
        $650 5
        £41.19 5
        $185 5
        $1,500 5
        €77.41 5
        $550 4
        $165 4
        $390 4
        $325 4
        US$100 4
        €174.09 4
        $4,000 4
        $475 4
        €72.54 4
        €82.21 4
        $395 4
        $270 4
        $900 4
        €53.19 4
        €33.85 4
        $950 4
        £123.57 4
        US$20 3
        ₪307.29 3
        €116.11 3
        €67.70 3
        $590 3
        ₪96.03 3
        ₪57.62 3
        £28.83 3
        US$35 3
        €483.58 3
        €106.39 3
        US$55 3
        ₪768.22 3
        €154.74 3
        $495 3
        $750 3
        $1,350 3
        $2,500 3
        $255 3
        $335 3
        €125.73 3
        €96.76 3
        ₪384.11 3
        ₪576.17 3
        €116.06 3
        €82.25 2
        $1,100 2
        £45.31 2
        ₪153.64 2
        ₪134.44 2
        $380 2
        €435.22 2
        $490 2
        €91.88 2
        $1,250 2
        US$150 2
        US$70 2
        $700 2
        US$95 2
        $230 2
        $1,495 2
        ₪960.28 2
        $340 2
        $3,400 2
        $320 2
        ₪76.82 2
        €193.52 2
        €77.37 2
        $480 2
        $445 2
        £74.14 2
        $20,000 2
        $1,400 2
        €19.35 2
        €338.50 2
        $285 2
        £8.24 2
        ₪749.02 1
        $7,500 1
        €967.60 1
        €382.20 1
        €962.76 1
        $2,700 1
        $1,225 1
        $1,005 1
        US$155 1
        £12.36 1
        $890 1
        €822.08 1
        €967.15 1
        US$200 1
        $555 1
        $1,875 1
        $10,000 1
        €377.19 1
        US$500 1
        €149.91 1
        $280 1
        $440 1
        US$1,200 1
        US$390 1
        $560 1
        €3,676 1
        €507.76 1
        €87.08 1
        £2,472 1
        $370 1
        US$25 1
        US$10,000 1
        $1,600 1
        $265 1
        $2,695 1
        €169.25 1
        $460 1
        $430 1
        $795 1
        $4,500 1
        €101.55 1
        $455 1
        $1,365 1
        US$120 1
        $540 1
        $505 1
        €406.20 1
        $3,750 1
        $360 1
        $2,900 1
        $6,500 1
        $1,295 1
        $315 1
        €125.79 1
        $405 1
        $660 1
        €488.64 1
        €411.04 1
        $810 1
        $690 1
        £131.81 1
        ₪2,497 1
        $425 1
        $920 1
        $980 1
        €449.73 1
        $345 1
        €241.90 1
        £152.41 1
        US$295 1
        $525 1
        €435.42 1
        $235 1
        US$110 1
        £65.90 1
        ₪249.67 1
        €120.89 1
        €33.87 1
        €9.68 1
        £61.79 1
        $670 1
        US$65 1
        $745 1
        £20.60 1
        US$190 1
        ₪518.55 1
        US$45 1
        €774.08 1
        €130.57 1
        $3,200 1
        €478.74 1
        €531.93 1
        ₪172.85 1
        $3,800 1
        €391.70 1
        ₪2,843 1
        €285.31 1
        ₪633.78 1
        US$290 1
        $740 1
        $3,600 1
        €1,161 1
        $610 1
        £1,607 1
        $1,755 1
        US$300 1
        €962.32 1
        £107.10 1
        $1,300 1
        US$350 1
        €164.42 1
        $3,500 1
        €203.10 1
        $11,400 1
        €333.67 1
        ₪364.91 1
        ₪921.87 1
        €628.65 1
        $990 1
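Some scraped prices carry a promotional suffix, for example `$50Save up to 10% with Subscribe to Save` in the first listing. Cutting the string at the marker folds those rows back into their base price, which is why the `$10` count rises from 678 to 685 between the two listings. A minimal sketch of the same cut on plain strings follows; the helper name `strip_promo` is hypothetical and not part of the notebook.

```python
def strip_promo(price_text: str, marker: str = "Save up to") -> str:
    """Return the part of price_text that precedes the promotional marker."""
    # partition() returns the whole string as `head` when the marker is absent.
    head, _, _ = price_text.partition(marker)
    return head.strip()

# Values taken from the first 'Basic Price' listing in this notebook.
print(strip_promo("$50Save up to 10% with Subscribe to Save"))
print(strip_promo("€9.67"))
```

With pandas, the vectorised `str.split("Save up to").str[0]` used in the notebook does the same job for a whole column.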
In [ ]:
from forex_python.converter import CurrencyRates

def find_exchange_rates():
    """Return USD-per-unit rates keyed by currency symbol, with hard-coded fallbacks."""
    try:
        cr = CurrencyRates()
        euro_ex_rate = cr.get_rate('EUR', 'USD')
        shekel_ex_rate = cr.get_rate('ILS', 'USD')
        pound_ex_rate = cr.get_rate('GBP', 'USD')
        return {'$': 1, '€': euro_ex_rate, '₪': shekel_ex_rate, 'US$': 1, '£': pound_ex_rate}
    except Exception:
        # Fall back to fixed approximate rates if the rate service is unreachable.
        return {'$': 1, '€': 1.08, '₪': 0.27, 'US$': 1, '£': 1.27}
In [ ]:
exchange_rates = find_exchange_rates()

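def convert_to_usd(price):
    """Parse a price string such as '€9.67' and scale it by the rate of its currency symbol."""
    for currency, exchange_rate in exchange_rates.items():
        if price.startswith(currency):
            # Slice off the matched symbol, then drop thousands separators.
            numeric_price = float(price[len(currency):].replace(',', ''))
            # Note: this divides by the rate, and the printed values below reflect that.
            return round(numeric_price / exchange_rate, 2)
    return price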

df['Basic Price'] = df['Basic Price'].apply(convert_to_usd)
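The fallback rates in `find_exchange_rates` (1.08 for the euro, 0.27 for the shekel, 1.27 for the pound) read as US dollars per one unit of the foreign currency, and `get_rate('EUR', 'USD')` from forex_python returns a rate in that same direction. Under that reading, a foreign amount is converted to dollars by multiplying by the rate, whereas `convert_to_usd` above divides. The sketch below shows the multiplying variant on plain strings so the two can be compared. The names `USD_PER_UNIT` and `to_usd` are hypothetical, and the rates are simply the notebook's fallback values, not current market rates.

```python
# Fallback rates copied from the notebook, read as USD per one unit of currency.
USD_PER_UNIT = {"US$": 1.0, "$": 1.0, "€": 1.08, "₪": 0.27, "£": 1.27}

def to_usd(price_text: str) -> float:
    """Convert a string such as '€9.67' or 'US$1,200' to a USD amount."""
    # Try longer symbols first so a multi-character prefix is matched whole.
    for symbol in sorted(USD_PER_UNIT, key=len, reverse=True):
        if price_text.startswith(symbol):
            amount = float(price_text[len(symbol):].replace(",", ""))
            # Multiply: the rate is dollars per unit of the source currency.
            return round(amount * USD_PER_UNIT[symbol], 2)
    raise ValueError(f"unrecognised currency prefix in {price_text!r}")

print(to_usd("€9.67"))
print(to_usd("₪115.23"))
print(to_usd("US$1,200"))
```

With these rates, `€9.67` comes out near 10.44 here, against 8.95 in the notebook's output, and `₪115.23` near 31.11 against 426.78.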
In [ ]:
print_unique_elements(df, ['Basic Price'])
Basic Price:
        10.0 692
        5.0 495
        100.0 422
        30.0 376
        15.0 338
        50.0 326
        20.0 311
        25.0 243
        150.0 192
        40.0 167
        35.0 150
        80.0 117
        200.0 106
        60.0 99
        45.0 99
        90.0 95
        75.0 74
        250.0 66
        120.0 63
        125.0 60
        8.95 59
        300.0 58
        70.0 58
        65.0 46
        4.48 44
        500.0 41
        55.0 39
        95.0 39
        140.0 35
        13.44 33
        44.78 29
        17.91 29
        85.0 28
        350.0 27
        89.56 26
        110.0 25
        130.0 25
        105.0 22
        134.32 19
        26.86 19
        160.0 19
        180.0 18
        135.0 18
        195.0 17
        22.39 17
        400.0 17
        145.0 15
        175.0 14
        1000.0 14
        600.0 13
        170.0 13
        115.0 12
        80.59 11
        295.0 11
        190.0 10
        225.0 10
        40.3 10
        450.0 9
        2000.0 9
        35.82 8
        426.78 8
        220.0 8
        71.15 7
        53.73 7
        64.87 7
        245.0 7
        223.88 7
        179.1 7
        800.0 6
        290.0 6
        268.66 6
        1200.0 6
        3.24 6
        375.0 6
        275.0 6
        155.0 6
        19.46 6
        210.0 5
        3000.0 5
        995.0 5
        650.0 5
        1500.0 5
        71.68 5
        142.26 5
        1800.0 5
        32.43 5
        390.0 5
        711.33 5
        185.0 5
        240.0 5
        67.17 4
        76.12 4
        97.3 4
        475.0 4
        900.0 4
        270.0 4
        49.25 4
        325.0 4
        161.19 4
        31.34 4
        4000.0 4
        950.0 4
        165.0 4
        550.0 4
        395.0 4
        2500.0 3
        1422.63 3
        89.59 3
        750.0 3
        335.0 3
        1138.11 3
        355.67 3
        590.0 3
        22.7 3
        143.28 3
        62.69 3
        213.41 3
        98.51 3
        447.76 3
        116.42 3
        255.0 3
        1350.0 3
        2845.26 3
        107.51 3
        107.46 3
        2133.96 3
        495.0 3
        445.0 2
        3400.0 2
        380.0 2
        490.0 2
        76.16 2
        58.38 2
        1495.0 2
        284.52 2
        20000.0 2
        569.04 2
        313.43 2
        700.0 2
        1100.0 2
        1250.0 2
        402.98 2
        85.07 2
        10000.0 2
        230.0 2
        71.64 2
        6.49 2
        1400.0 2
        340.0 2
        35.68 2
        17.92 2
        480.0 2
        179.19 2
        285.0 2
        3556.59 2
        497.93 2
        320.0 2
        349.25 1
        2695.0 1
        761.19 1
        555.0 1
        890.0 1
        9.73 1
        265.0 1
        1365.0 1
        1600.0 1
        138.81 1
        280.0 1
        1005.0 1
        370.0 1
        560.0 1
        1875.0 1
        2700.0 1
        7500.0 1
        405.0 1
        440.0 1
        1225.0 1
        156.71 1
        452.44 1
        505.0 1
        376.11 1
        3750.0 1
        360.0 1
        2900.0 1
        6500.0 1
        1295.0 1
        315.0 1
        116.47 1
        455.0 1
        660.0 1
        540.0 1
        1946.46 1
        430.0 1
        94.03 1
        80.63 1
        4500.0 1
        795.0 1
        895.93 1
        460.0 1
        353.89 1
        891.44 1
        670.0 1
        610.0 1
        810.0 1
        492.53 1
        345.0 1
        3800.0 1
        891.04 1
        235.0 1
        84.33 1
        51.89 1
        924.7 1
        111.94 1
        152.24 1
        48.65 1
        308.95 1
        582.08 1
        690.0 1
        716.74 1
        980.0 1
        470.15 1
        640.19 1
        223.98 1
        120.01 1
        1920.56 1
        525.0 1
        403.17 1
        31.36 1
        8.96 1
        16.22 1
        416.42 1
        3200.0 1
        920.0 1
        1755.0 1
        380.59 1
        2774.15 1
        362.69 1
        10529.63 1
        264.18 1
        2347.33 1
        740.0 1
        3600.0 1
        1075.0 1
        745.0 1
        1265.35 1
        3403.7 1
        425.0 1
        1300.0 1
        3500.0 1
        188.06 1
        11400.0 1
        1351.52 1
        3414.33 1
        443.28 1
        120.9 1
        895.51 1
        103.79 1
        9248.15 1
        990.0 1
In [ ]:
df['Standard Price'] = df['Standard Price'].str.split("Save up to").str[0]
df['Standard Price'] = df['Standard Price'].apply(convert_to_usd)
print_unique_elements(df, ['Standard Price'])
Standard Price:
        50.0 356
        20.0 263
        100.0 243
        30.0 235
        10.0 233
        150.0 227
        25.0 226
        40.0 202
        15.0 195
        200.0 182
        60.0 178
        250.0 159
        80.0 148
        70.0 137
        300.0 116
        120.0 108
        35.0 107
        75.0 105
        90.0 102
        45.0 93
        5.0 79
        500.0 76
        350.0 69
        400.0 60
        180.0 60
        65.0 58
        95.0 53
        55.0 52
        125.0 52
        130.0 41
        160.0 41
        600.0 37
        140.0 36
        450.0 36
        110.0 36
        175.0 32
        1000.0 32
        85.0 32
        195.0 31
        44.78 30
        1500.0 27
        190.0 26
        225.0 25
        550.0 25
        8.95 23
        155.0 23
        17.91 22
        22.39 20
        220.0 20
        800.0 20
        13.44 20
        170.0 20
        280.0 19
        115.0 18
        26.86 18
        240.0 17
        295.0 17
        270.0 16
        165.0 15
        145.0 15
        135.0 15
        750.0 14
        179.1 14
        230.0 14
        320.0 14
        134.32 14
        105.0 14
        1200.0 13
        700.0 13
        4000.0 13
        89.56 13
        2000.0 12
        395.0 12
        2500.0 12
        290.0 11
        5000.0 11
        67.17 11
        650.0 11
        260.0 11
        71.64 11
        275.0 11
        223.88 11
        210.0 11
        35.82 10
        185.0 10
        245.0 10
        375.0 10
        53.73 10
        268.66 9
        3000.0 9
        31.34 9
        62.69 9
        360.0 8
        107.46 8
        495.0 8
        4.48 8
        325.0 8
        595.0 8
        40.3 7
        205.0 7
        447.76 7
        1800.0 7
        900.0 7
        475.0 6
        215.0 6
        358.2 6
        582.08 5
        480.0 5
        285.0 5
        390.0 5
        58.2 5
        853.59 5
        161.19 5
        2400.0 4
        420.0 4
        850.0 4
        330.0 4
        995.0 4
        695.0 4
        71.15 4
        235.0 4
        380.0 4
        1995.0 4
        626.86 4
        355.67 4
        1250.0 4
        80.59 4
        116.42 4
        1422.63 4
        64.87 4
        711.33 4
        537.31 4
        98.51 3
        675.0 3
        156.71 3
        590.0 3
        174.62 3
        990.0 3
        490.0 3
        305.0 3
        340.0 3
        4500.0 3
        38.92 3
        129.73 3
        49.25 3
        6000.0 3
        255.0 3
        9.73 3
        1707.15 3
        250.74 3
        310.0 3
        540.0 3
        365.0 3
        560.0 3
        4270.37 3
        1495.0 3
        1100.0 3
        1375.0 3
        795.0 3
        497.93 2
        1066.96 2
        143.28 2
        89.59 2
        430.0 2
        197.01 2
        7000.0 2
        111.94 2
        2800.0 2
        129.85 2
        569.04 2
        32.43 2
        402.98 2
        345.0 2
        1600.0 2
        268.78 2
        530.0 2
        206.06 2
        335.81 2
        492.53 2
        2845.26 2
        630.0 2
        1450.0 2
        525.0 2
        1300.0 2
        1050.0 2
        445.0 2
        201.49 2
        2560.74 2
        162.17 2
        29.19 2
        3995.0 2
        1795.0 2
        315.0 2
        97.3 2
        250.86 2
        123.24 2
        10000.0 2
        8000.0 2
        85.07 2
        3800.0 2
        1350.0 2
        625.0 2
        510.0 2
        1700.0 2
        313.57 2
        113.52 2
        308.95 2
        3500.0 2
        161.27 2
        2100.0 2
        284.52 2
        76.12 2
        4477.78 2
        20000.0 2
        213.41 2
        6.49 2
        1343.52 2
        640.0 2
        165.75 1
        850.74 1
        2990.0 1
        1210.0 1
        179.19 1
        1792.59 1
        1280.0 1
        194.6 1
        1490.0 1
        3985.19 1
        1810.0 1
        286.56 1
        2995.0 1
        4100.0 1
        15000.0 1
        2235.19 1
        1400.0 1
        16.22 1
        761.19 1
        820.0 1
        1410.0 1
        7500.0 1
        1710.0 1
        2700.0 1
        5400.0 1
        4725.0 1
        4550.0 1
        1787.96 1
        658.51 1
        745.0 1
        376.11 1
        12.98 1
        1791.67 1
        147.76 1
        1075.0 1
        116.76 1
        2310.0 1
        895.0 1
        950.0 1
        165.67 1
        1520.0 1
        90.81 1
        142.26 1
        265.0 1
        1590.0 1
        370.0 1
        1778.3 1
        995.85 1
        2185.0 1
        485.0 1
        1820.0 1
        780.0 1
        760.0 1
        2250.0 1
        273.13 1
        1950.0 1
        1485.0 1
        3892.13 1
        443.48 1
        188.15 1
        71.35 1
        286.69 1
        685.0 1
        631.63 1
        246.38 1
        1565.0 1
        2670.0 1
        9750.0 1
        710.0 1
        12000.0 1
        5900.0 1
        575.0 1
        479.1 1
        1330.0 1
        232.83 1
        3295.0 1
        443.28 1
        48.65 1
        665.0 1
        2489.59 1
        26.88 1
        44.8 1
        8.96 1
        71.68 1
        17.92 1
        870.0 1
        565.0 1
        335.0 1
        410.0 1
        880.0 1
        2062.81 1
        694.02 1
        313.43 1
        19.46 1
        58.38 1
        74.6 1
        42.17 1
        790.0 1
        1138.11 1
        81.09 1
        545.0 1
        3.24 1
        627.15 1
        120.01 1
        5800.0 1
        3556.59 1
        865.0 1
        528.35 1
        455.0 1
        6500.0 1
        2774.15 1
        7114.81 1
        4657.41 1
        5600.0 1
        4400.0 1
        35.83 1
        7400.0 1
        2300.0 1
        891.04 1
        640.19 1
        214.93 1
        84.33 1
        152.24 1
        690.0 1
        1344.44 1
        125.37 1
        205.97 1
        945.0 1
        895.51 1
        716.41 1
        3900.0 1
        1505.0 1
        5692.59 1
        2505.0 1
        1265.35 1
        1215.0 1
        2238.89 1
        840.0 1
        1095.0 1
        1338.89 1
        5977.78 1
        622.38 1
        425.0 1
        15581.48 1
        324.34 1
        599.99 1
        2895.0 1
        940.74 1
        740.0 1
        520.0 1
        405.0 1
        4050.0 1
        295.52 1
        1900.0 1
        920.0 1
        825.0 1
        18496.3 1
        2150.0 1
        440.0 1
        142.71 1
        331.34 1
        385.0 1
        183.58 1
        925.0 1
        11400.0 1
        10670.37 1
        1493.78 1
        645.0 1
        711.94 1
        355.0 1
        1875.0 1
        4981.48 1
        1636.04 1
        415.0 1
        1388.89 1
In [ ]:
df['Premium Price'] = df['Premium Price'].str.split("Save up to").str[0]
df['Premium Price'] = df['Premium Price'].apply(convert_to_usd)
print_unique_elements(df, ['Premium Price'])
Premium Price:
        100.0 365
        50.0 237
        200.0 220
        150.0 217
        30.0 167
        300.0 156
        250.0 150
        500.0 143
        60.0 140
        20.0 138
        40.0 135
        120.0 122
        25.0 119
        15.0 113
        80.0 111
        400.0 110
        90.0 108
        350.0 102
        45.0 91
        5.0 79
        35.0 78
        75.0 72
        10.0 71
        450.0 68
        70.0 66
        180.0 66
        1000.0 66
        65.0 56
        600.0 55
        95.0 54
        130.0 53
        125.0 53
        55.0 46
        160.0 46
        85.0 45
        110.0 43
        1500.0 39
        175.0 34
        800.0 32
        700.0 31
        170.0 30
        190.0 29
        750.0 29
        195.0 28
        220.0 25
        240.0 25
        1200.0 24
        650.0 24
        89.56 24
        225.0 23
        2000.0 22
        140.0 22
        550.0 22
        3000.0 21
        135.0 20
        44.78 20
        145.0 20
        495.0 20
        2500.0 18
        320.0 18
        115.0 18
        375.0 18
        185.0 18
        230.0 18
        900.0 18
        995.0 17
        179.1 17
        105.0 16
        270.0 16
        10000.0 16
        295.0 15
        210.0 15
        390.0 15
        395.0 15
        134.32 15
        26.86 14
        280.0 14
        290.0 14
        155.0 14
        17.91 13
        35.82 13
        850.0 13
        5000.0 13
        260.0 13
        950.0 13
        40.3 12
        8000.0 12
        268.66 12
        165.0 12
        223.88 12
        71.64 11
        22.39 11
        360.0 11
        480.0 11
        1250.0 10
        53.73 10
        325.0 10
        13.44 10
        275.0 10
        8.95 9
        4000.0 9
        3500.0 8
        595.0 8
        345.0 8
        590.0 8
        1600.0 8
        990.0 8
        4.48 8
        385.0 8
        1100.0 8
        490.0 8
        4500.0 8
        447.76 8
        358.2 7
        475.0 7
        255.0 7
        125.37 7
        420.0 7
        520.0 7
        895.51 6
        116.42 6
        425.0 6
        460.0 6
        67.17 6
        285.0 6
        6000.0 6
        370.0 6
        7500.0 6
        20000.0 6
        402.98 6
        1800.0 6
        245.0 6
        380.0 6
        80.59 6
        107.46 5
        340.0 5
        540.0 5
        330.0 5
        313.43 5
        111.94 5
        795.0 5
        1138.11 5
        1350.0 5
        625.0 4
        76.12 4
        695.0 4
        129.73 4
        4981.48 4
        97.3 4
        315.0 4
        71.15 4
        305.0 4
        2133.96 4
        215.0 4
        440.0 4
        335.0 4
        980.0 4
        530.0 4
        161.19 4
        537.31 4
        430.0 4
        1343.52 4
        205.0 4
        497.93 3
        445.0 3
        7000.0 3
        975.0 3
        12.98 3
        365.0 3
        58.2 3
        525.0 3
        415.0 3
        4495.0 3
        31.34 3
        15000.0 3
        49.25 3
        2800.0 3
        197.01 3
        1300.0 3
        2100.0 3
        2700.0 3
        891.04 3
        775.0 3
        355.0 3
        1422.63 3
        492.53 3
        690.0 3
        152.24 3
        1150.0 3
        605.0 3
        671.64 3
        1400.0 3
        711.33 3
        675.0 3
        310.0 3
        143.28 2
        725.0 2
        660.0 2
        630.0 2
        680.0 2
        213.41 2
        2200.0 2
        45.41 2
        61.62 2
        570.0 2
        265.0 2
        286.56 2
        1209.22 2
        8955.56 2
        2238.89 2
        353.73 2
        58.38 2
        470.0 2
        1254.63 2
        7800.0 2
        1900.0 2
        1450.0 2
        1995.0 2
        2687.04 2
        925.0 2
        1040.0 2
        324.34 2
        886.56 2
        2995.0 2
        2490.0 2
        3200.0 2
        4995.0 2
        98.51 2
        6500.0 2
        455.0 2
        14229.63 2
        313.57 2
        880.0 2
        2400.0 2
        3135.19 2
        1791.67 2
        780.0 2
        35566.67 2
        1195.0 2
        3600.0 2
        3556.59 2
        194.6 2
        670.0 2
        3200.93 2
        405.0 2
        1850.0 2
        1795.0 2
        62.69 2
        2300.0 2
        667.16 2
        740.0 2
        181.63 2
        162.17 2
        720.0 2
        403.17 2
        235.0 2
        582.08 2
        985.19 2
        219.4 2
        585.0 2
        29.19 1
        5407.41 1
        761.19 1
        8500.0 1
        1120.0 1
        7020.0 1
        438.81 1
        4125.0 1
        6215.0 1
        745.0 1
        545.0 1
        9985.0 1
        9100.0 1
        8059.26 1
        945.37 1
        7167.59 1
        1260.0 1
        344.94 1
        1495.0 1
        250.86 1
        8900.0 1
        7495.0 1
        3750.0 1
        3310.0 1
        850.74 1
        3990.0 1
        3250.0 1
        6300.0 1
        845.0 1
        2720.0 1
        2821.3 1
        1105.0 1
        2160.0 1
        518.94 1
        716.41 1
        5990.0 1
        2642.59 1
        3050.0 1
        4270.37 1
        3400.0 1
        9995.0 1
        2020.0 1
        10500.0 1
        2975.0 1
        13500.0 1
        439.0 1
        7995.0 1
        465.67 1
        430.05 1
        259.81 1
        165.67 1
        85.07 1
        4585.0 1
        4900.0 1
        505.0 1
        4200.0 1
        1095.0 1
        1330.56 1
        1790.0 1
        429.84 1
        2270.0 1
        655.0 1
        2010.0 1
        407.45 1
        4545.0 1
        2850.0 1
        201.49 1
        575.0 1
        1030.56 1
        5030.0 1
        1220.0 1
        1750.0 1
        1351.52 1
        322.39 1
        3020.0 1
        227.03 1
        1905.0 1
        2195.0 1
        1050.0 1
        891.44 1
        3100.0 1
        376.29 1
        3695.0 1
        129.85 1
        156.71 1
        9730.71 1
        1385.0 1
        820.0 1
        179.19 1
        134.39 1
        895.93 1
        309.09 1
        188.06 1
        3675.0 1
        645.0 1
        447.96 1
        985.0 1
        721.22 1
        358.37 1
        1925.0 1
        1275.0 1
        4090.0 1
        12000.0 1
        3350.0 1
        8795.0 1
        7505.0 1
        790.0 1
        945.0 1
        16.22 1
        77.84 1
        10800.0 1
        6800.0 1
        259.69 1
        5622.22 1
        138.81 1
        3272.07 1
        214.93 1
        640.19 1
        113.52 1
        19.46 1
        64.87 1
        183.58 1
        188.12 1
        6985.19 1
        241.79 1
        13659.26 1
        940.0 1
        995.85 1
        2774.15 1
        1290.0 1
        452.23 1
        1564.89 1
        250.74 1
        174.62 1
        1636.04 1
        4555.56 1
        2750.0 1
        1345.0 1
        420.89 1
        895.0 1
        22.4 1
        3.24 1
        627.15 1
        35.83 1
        71.68 1
        120.01 1
        152.31 1
        85.11 1
        17.92 1
        2805.0 1
        970.0 1
        5745.0 1
        2419.44 1
        48.65 1
        308.95 1
        51.89 1
        84.33 1
        2560.74 1
        782.44 1
        145.95 1
        6.49 1
        1595.0 1
        1305.0 1
        284.52 1
        355.67 1
        10670.37 1
        6403.7 1
        1180.0 1
        2820.0 1
        1645.0 1
        9600.0 1
        1782.41 1
        5500.0 1
        3800.0 1
        1090.0 1
        1523.15 1
        9000.0 1
        805.96 1
        3505.0 1
        1265.35 1
        3995.0 1
        2220.0 1
        3582.41 1
        3195.0 1
        5692.59 1
        541.79 1
        615.0 1
        2060.0 1
        3577.78 1
        1338.89 1
        29307.41 1
        1297.64 1
        841.78 1
        4523.15 1
        2284.26 1
        5600.0 1
        1650.0 1
        810.0 1
        465.0 1
        168.65 1
        805.0 1
        317.91 1
        1075.0 1
        555.22 1
        1550.0 1
        411.94 1
        228.35 1
        860.0 1
        920.0 1
        760.0 1
        1225.0 1
        1280.0 1
        1700.0 1
        840.0 1
        501.49 1
        580.0 1
        2295.0 1
        394.03 1
        191.35 1
        1210.0 1
        610.0 1
        960.0 1
        514.92 1
        981.48 1
        8537.04 1
        2062.81 1
        11400.0 1
        483.57 1
        2250.0 1
In [ ]:
print_unique_elements(df, ['Basic Delivery'])
Basic Delivery:
        2 days 1305
        3 days 1109
        1 day 1105
        5 days 442
        7 days 440
        4 days 322
        10 days 182
        14 days 152
        2-day delivery 142
        1-day delivery 130
        3-day delivery 114
        7-day delivery 81
        6 days 66
        30 days 64
        21 days 56
        5-day delivery 56
        4-day delivery 39
        10-day delivery 29
        30-day delivery 28
        14-day delivery 27
        21-day delivery 13
        6-day delivery 7
        90-day delivery 3
        45 days 3
        60 days 2
        45-day delivery 2
        29-day delivery 1
        60-day delivery 1
        90 days 1
In [ ]:
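# Keep only the number of days: '2 days' and '2-day delivery' both become 2.
# The raw string avoids the invalid '\d' escape warning in newer Python versions.
df['Basic Delivery'] = df['Basic Delivery'].str.extract(r'(\d+)', expand=False).astype(int)
print_unique_elements(df, ['Basic Delivery'])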
Basic Delivery:
        2 1447
        1 1235
        3 1223
        7 521
        5 498
        4 361
        10 211
        14 179
        30 92
        6 73
        21 69
        45 5
        90 4
        60 3
        29 1
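As a minimal sketch of what the extraction above does, the snippet below runs the same `(\d+)` pattern on a few hand-made strings in the two spellings seen in the raw output ('2 days' and '2-day delivery'). The values and the name `sample` are invented for illustration and are not rows from FiverrFinal.csv.

```python
import pandas as pd

# Hypothetical sample mirroring the two spellings in the scraped column.
sample = pd.Series(["2 days", "2-day delivery", "1 day", "14-day delivery"])

# expand=False returns a Series; the capture group keeps the first run of digits.
days = sample.str.extract(r"(\d+)", expand=False).astype(int)

print(days.value_counts().sort_index())
```

Both spellings of the same duration collapse into one integer value, which is why the cleaned column has fewer distinct entries than the raw one.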
In [ ]:
# Same extraction as for 'Basic Delivery'.
df['Standard Delivery'] = df['Standard Delivery'].str.extract(r'(\d+)', expand=False).astype(int)
print_unique_elements(df, ['Standard Delivery'])
Standard Delivery:
        3 1106
        2 1006
        5 777
        7 709
        4 613
        1 453
        10 381
        14 324
        6 184
        30 161
        21 158
        45 24
        60 17
        90 4
        75 2
        15 2
        29 1
In [ ]:
# Same extraction as for 'Basic Delivery'.
df['Premium Delivery'] = df['Premium Delivery'].str.extract(r'(\d+)', expand=False).astype(int)
print_unique_elements(df, ['Premium Delivery'])
Premium Delivery:
        7 966
        3 829
        5 742
        10 589
        2 532
        4 498
        14 457
        1 353
        30 352
        6 231
        21 218
        45 63
        60 43
        90 35
        75 9
        8 2
        29 1
        28 1
        20 1
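The three delivery columns go through the same extraction, and `astype(int)` raises if any row has no digits or is missing. A hedged sketch of a reusable helper that tolerates such rows is shown below. The helper name `extract_days`, the toy DataFrame, and the choice of the nullable `Int64` dtype are assumptions of this example, not part of the original notebook.

```python
import pandas as pd

def extract_days(frame, columns):
    """Return a copy of `frame` with each listed column reduced to its first number."""
    out = frame.copy()
    for col in columns:
        digits = out[col].str.extract(r"(\d+)", expand=False)
        # to_numeric + Int64 keeps rows without digits as <NA> instead of raising.
        out[col] = pd.to_numeric(digits, errors="coerce").astype("Int64")
    return out

# Toy data (hypothetical), including a missing value and a string without digits.
toy = pd.DataFrame({
    "Basic Delivery": ["1 day", "3-day delivery", None],
    "Premium Delivery": ["7 days", "no estimate", "30-day delivery"],
})
cleaned = extract_days(toy, ["Basic Delivery", "Premium Delivery"])
print(cleaned)
```

If every row is known to contain a number, the plain `astype(int)` used in the cells above is sufficient.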
In [ ]:
print_unique_elements(df, ['Basic Revision'])
Basic Revision:
        -1 2340
        1 1060
        Unlimited 791
        2 657
        0 323
        3 322
        5 99
        1 Revision 87
        Unlimited Revisions 71
        2 Revisions 53
        4 41
        3 Revisions 30
        9 16
        5 Revisions 12
        6 8
        7 5
        4 Revisions 3
        8 2
        8 Revisions 1
        9 Revisions 1

Replace Unlimited Revisions with a very high value (100) so that this field becomes numeric¶

In [ ]:
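# Series.replace matches whole cells only, so 'Unlimited Revisions' becomes 100,
# while the bare 'Unlimited' has no digits and ends up as 0 through fillna(0).
df['Basic Revision'] = df['Basic Revision'].replace('Unlimited Revisions', '100').str.extract(r'(-?\d+)', expand=False).fillna(0).astype(int)
print_unique_elements(df, ['Basic Revision'])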
Basic Revision:
        -1 2340
        1 1147
        0 1114
        2 710
        3 352
        5 111
        100 71
        4 44
        9 17
        6 8
        7 5
        8 3
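In the output above, the 0 bucket (1114) equals the earlier count for 0 (323) plus the count for the bare 'Unlimited' spelling (791), while only the 71 'Unlimited Revisions' rows reached 100. The following is a minimal sketch of one way to send both spellings to the same sentinel. The sample values and the names `sample`, `is_unlimited` and `UNLIMITED` are hypothetical, and this is not the cleaning that produced the counts shown in this notebook.

```python
import pandas as pd

UNLIMITED = 100  # sentinel value for unlimited revisions, as in the heading above

# Hypothetical sample covering the spellings seen in the raw column.
sample = pd.Series(
    ["-1", "1", "Unlimited", "2 Revisions", "Unlimited Revisions", "0", None]
)

# Flag every cell that starts with "Unlimited", whichever spelling it uses.
is_unlimited = sample.str.startswith("Unlimited", na=False)

digits = sample.str.extract(r"(-?\d+)", expand=False)  # NaN where no digits exist
revisions = (
    pd.to_numeric(digits, errors="coerce")
    .fillna(0)                      # same default as the notebook cell
    .astype(int)
    .mask(is_unlimited, UNLIMITED)  # overwrite both "Unlimited" spellings
)

print(revisions.tolist())
```

Flagging the rows before extracting digits keeps a genuine 0 distinguishable from an unlimited offer.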
In [ ]:
# Same cleaning as for 'Basic Revision'.
df['Standard Revision'] = df['Standard Revision'].replace('Unlimited Revisions', '100').str.extract(r'(-?\d+)', expand=False).fillna(0).astype(int)
print_unique_elements(df, ['Standard Revision'])
Standard Revision:
        -1 2340
        0 1031
        2 887
        1 668
        3 484
        5 218
        4 117
        100 71
        9 37
        6 31
        7 24
        8 14
In [ ]:
# Same cleaning as for 'Basic Revision'.
df['Premium Revision'] = df['Premium Revision'].replace('Unlimited Revisions', '100').str.extract(r'(-?\d+)', expand=False).fillna(0).astype(int)
print_unique_elements(df, ['Premium Revision'])
Premium Revision:
        -1 2340
        0 1281
        2 631
        3 583
        1 456
        5 254
        4 136
        100 71
        9 66
        7 42
        6 37
        8 25
In [ ]:
print_unique_elements(df, ['Rating Count'])
Rating Count:
        3 104
        2 102
        1.0 101
        1 93
        4 85
        5 80
        7 76
        11 72
        12 71
        3.0 70
        2.0 67
        13 66
        10 66
        21 65
        16 63
        15 63
        6 62
        22 58
        20 53
        17 52
        18 51
        14 50
        24 47
        19 47
        25 46
        26 46
        8 44
        9 44
        33 43
        5.0 41
        35 40
        29 39
        30 39
        23 36
        28 34
        31 34
        27 33
        40 33
        4.0 33
        34 32
        15.0 30
        8.0 29
        9.0 29
        39 28
        6.0 27
        11.0 27
        48 26
        37 25
        44 25
        12.0 25
        41 25
        63 24
        7.0 24
        43 24
        79 24
        45 24
        46 24
        32 24
        47 23
        18.0 23
        36 23
        49 23
        53 23
        38 23
        42 22
        56 21
        10.0 20
        62 20
        51 20
        50 20
        60 19
        17.0 19
        66 18
        64 18
        54 17
        16.0 17
        57 17
        91 17
        61 17
        73 16
        70 16
        52 16
        101 16
        93 15
        59 15
        14.0 15
        75 15
        92 15
        81 15
        25.0 15
        20.0 14
        55 14
        23.0 14
        68 14
        82 14
        84 14
        131 13
        99 13
        89 12
        13.0 12
        153 12
        107 12
        76 12
        78 12
        24.0 12
        35.0 11
        108 11
        80 11
        71 11
        110 11
        77 11
        122 11
        94 11
        96 11
        114 11
        36.0 10
        145 10
        58 10
        97 10
        19.0 10
        87 10
        197 10
        65 10
        115 10
        72 10
        67 10
        86 10
        177 10
        124 9
        143 9
        219 9
        83 9
        175 9
        29.0 9
        123 9
        102 9
        21.0 9
        103 9
        74 9
        198 9
        22.0 9
        34.0 9
        144 9
        162 9
        104 8
        182 8
        106 8
        135 8
        185 8
        293 8
        174 8
        45.0 8
        69 8
        150 8
        138 8
        105 8
        118 8
        201 8
        100 8
        139 8
        109 8
        211 7
        98 7
        112 7
        209 7
        111 7
        38.0 7
        195 7
        300 7
        116 7
        85 7
        192 7
        128 7
        247 7
        141 7
        137 7
        31.0 7
        164 7
        78.0 7
        199 7
        218 7
        28.0 7
        56.0 6
        171 6
        183 6
        129 6
        229 6
        27.0 6
        58.0 6
        147 6
        243 6
        134 6
        163 6
        412 6
        307 6
        193 6
        41.0 6
        26.0 6
        231 6
        155 6
        237 6
        217 6
        203 6
        309 6
        187 6
        287 6
        119 5
        148 5
        167 5
        121 5
        142 5
        301 5
        30.0 5
        125 5
        168 5
        313 5
        450 5
        90 5
        39.0 5
        336 5
        170 5
        156 5
        190 5
        161 5
        149 5
        133 5
        399 5
        88 5
        157 5
        132 5
        181 5
        32.0 5
        176 5
        256 5
        356 5
        130 5
        126 5
        267 5
        95 5
        274 5
        240 4
        260 4
        341 4
        194 4
        248 4
        387 4
        40.0 4
        402 4
        257 4
        186 4
        189 4
        325 4
        184 4
        284 4
        283 4
        279 4
        173 4
        65.0 4
        74.0 4
        306 4
        236 4
        159 4
        43.0 4
        220 4
        258 4
        351 4
        166 4
        348 4
        165 4
        377 4
        475 4
        172 4
        61.0 4
        385 4
        342 4
        117 4
        44.0 4
        403 4
        501 4
        223 4
        227 4
        160 4
        48.0 4
        295 4
        146 4
        42.0 4
        335 4
        158 4
        33.0 4
        417 4
        238 4
        304 4
        452 4
        136 4
        120 4
        207 4
        1,665 3
        338 3
        978 3
        311 3
        478 3
        355 3
        251 3
        228 3
        235 3
        254 3
        204 3
        179 3
        50.0 3
        483 3
        532 3
        70.0 3
        308 3
        791 3
        882 3
        303 3
        331 3
        568 3
        648 3
        494 3
        425 3
        282 3
        275 3
        261 3
        264 3
        272 3
        384 3
        215 3
        553 3
        249 3
        423 3
        277 3
        768 3
        289 3
        63.0 3
        528 3
        278 3
        246 3
        435 3
        127 3
        398 3
        334 3
        53.0 3
        651 3
        206 3
        59.0 3
        265 3
        188 3
        296 3
        133.0 3
        324 3
        69.0 3
        136.0 3
        392 3
        624 3
        326 3
        312 3
        178 3
        266 3
        302 3
        208 3
        101.0 3
        252 3
        52.0 3
        513 3
        169 3
        504 3
        499 3
        244 3
        411 3
        152 3
        588 2
        354 2
        555 2
        427 2
        932 2
        216.0 2
        665 2
        84.0 2
        290 2
        239 2
        328 2
        191 2
        777 2
        276 2
        546 2
        408 2
        459 2
        414 2
        401 2
        273 2
        480 2
        343 2
        2,452 2
        431 2
        410 2
        232 2
        54.0 2
        602 2
        846 2
        544 2
        569 2
        55.0 2
        350 2
        573 2
        97.0 2
        346 2
        271 2
        177.0 2
        263 2
        292 2
        1,469 2
        234 2
        877 2
        96.0 2
        884 2
        455 2
        102.0 2
        57.0 2
        562 2
        241 2
        731 2
        37.0 2
        349 2
        547 2
        472 2
        609 2
        825 2
        631 2
        314 2
        330 2
        584 2
        225 2
        317 2
        233 2
        841 2
        332 2
        62.0 2
        214 2
        132.0 2
        606 2
        335.0 2
        1,179 2
        545 2
        461 2
        556 2
        352 2
        682 2
        1,098 2
        46.0 2
        754 2
        3,896 2
        180 2
        51.0 2
        529 2
        205 2
        793 2
        1,390 2
        113 2
        5,291 2
        2,834 2
        424 2
        81.0 2
        210 2
        514 2
        1,116 2
        1,205 2
        507 2
        393 2
        216 2
        574 2
        1,806 2
        109.0 2
        561 2
        594 2
        659 2
        1,178 2
        367 2
        464 2
        165.0 2
        458 2
        230 2
        321 2
        262 2
        810 2
        548 2
        652 2
        611 2
        383 2
        72.0 2
        140 2
        253 2
        339 2
        650 2
        563 2
        680 2
        543 2
        154 2
        628 2
        627 2
        703 2
        405 2
        285 2
        294 2
        1,217 2
        1,915 2
        577 2
        347 2
        432 2
        805 2
        394 2
        1,015 2
        175.0 2
        396 2
        492 2
        767 2
        437 2
        365 2
        270 2
        245 2
        318 2
        570 2
        421 2
        1,227 2
        537 2
        344 2
        1,468 2
        196 2
        596 2
        620 2
        470 2
        380 2
        222 2
        151 2
        519 2
        593 2
        202 2
        418 2
        128.0 2
        154.0 2
        1,809 2
        428 2
        641 2
        610 2
        1,010 1
        933 1
        835 1
        1,988 1
        726 1
        1,616 1
        189.0 1
        899 1
        1,052 1
        746 1
        794 1
        782 1
        1,021 1
        381 1
        897 1
        560 1
        920 1
        130.0 1
        818 1
        745 1
        441 1
        892 1
        316 1
        336.0 1
        690 1
        4,168 1
        235.0 1
        2,239 1
        113.0 1
        550 1
        362.0 1
        1,422 1
        1,283 1
        498 1
        1,771 1
        1,344 1
        406 1
        676 1
        1,079 1
        377.0 1
        476 1
        213 1
        77.0 1
        370 1
        661 1
        486 1
        2,203 1
        1,407 1
        1,692 1
        1,150 1
        337 1
        1,047 1
        1,208 1
        3,196 1
        769 1
        720 1
        1,192 1
        888 1
        1,166 1
        1,355 1
        1,446 1
        721 1
        870 1
        411.0 1
        140.0 1
        194.0 1
        88.0 1
        286.0 1
        423.0 1
        387.0 1
        529.0 1
        600.0 1
        117.0 1
        198.0 1
        485.0 1
        94.0 1
        1,897 1
        8,847 1
        909 1
        4,676 1
        479 1
        112.0 1
        191.0 1
        787 1
        1,866 1
        1,220 1
        1,353 1
        749 1
        2,528 1
        1,244 1
        4,422 1
        2,005 1
        3,442 1
        950 1
        989 1
        433 1
        2,241 1
        1,532 1
        2,773 1
        579 1
        512 1
        517 1
        736 1
        1,016 1
        1,564 1
        557 1
        469 1
        1,143 1
        587 1
        558 1
        1,242 1
        320 1
        2,622 1
        422 1
        850 1
        358 1
        2,865 1
        832 1
        2,574 1
        212 1
        807 1
        1,191 1
        1,121 1
        827 1
        698 1
        973 1
        986 1
        1,551 1
        3,291 1
        1,850 1
        386 1
        1,885 1
        1,549 1
        1,661 1
        286 1
        1,128 1
        1,132 1
        366 1
        2,016 1
        391 1
        462 1
        345 1
        357 1
        756 1
        554 1
        526 1
        1,433 1
        586 1
        1,473 1
        1,250 1
        771 1
        453 1
        1,142 1
        1,503 1
        601 1
        1,061 1
        664 1
        400 1
        917 1
        1,158 1
        1,693 1
        2,189 1
        1,020 1
        820 1
        3,046 1
        1,596 1
        1,853 1
        4,276 1
        1,377 1
        4,015 1
        5,396 1
        567 1
        837 1
        642 1
        2,061 1
        364 1
        1,050 1
        812 1
        200 1
        1,105 1
        1,051 1
        1,049 1
        2,069 1
        630 1
        998 1
        407 1
        2,852 1
        1,832 1
        226 1
        389.0 1
        774 1
        647 1
        1,045 1
        1,040 1
        1,032 1
        775 1
        497 1
        867 1
        2,778 1
        323 1
        849 1
        765 1
        1,681 1
        1,374 1
        2,658 1
        1,022 1
        925 1
        881.0 1
        801.0 1
        430 1
        603.0 1
        520 1
        715 1
        2,115 1
        5,310 1
        559 1
        2,103 1
        862 1
        1,766 1
        1,139 1
        11,201 1
        549 1
        1,101 1
        1,689 1
        1,586 1
        288 1
        1,707 1
        301.0 1
        259 1
        2,335 1
        842 1
        716 1
        2,580 1
        1,361 1
        625 1
        707 1
        448 1
        4,726 1
        371 1
        761 1
        2,089 1
        3,566 1
        1,935 1
        395 1
        5,088 1
        723 1
        1,958 1
        2,251 1
        757 1
        566.0 1
        709 1
        1,943 1
        1,038 1
        1,005 1
        748 1
        252.0 1
        641.0 1
        1,000 1
        141.0 1
        91.0 1
        964 1
        2,943 1
        2,678 1
        510 1
        4,342 1
        340 1
        1,601 1
        353 1
        1,006 1
        2,607 1
        2,359 1
        946 1
        1,274 1
        1,925 1
        389 1
        291 1
        1,559 1
        885 1
        3,781 1
        5,800 1
        1,318 1
        1,431 1
        297 1
        595 1
        581 1
        379 1
        667 1
        429 1
        535 1
        904 1
        446 1
        585 1
        1,983 1
        500 1
        516 1
        447 1
        685 1
        523 1
        1,827 1
        1,119 1
        1,534 1
        2,007 1
        3,577 1
        66.0 1
        73.0 1
        67.0 1
        80.0 1
        106.0 1
        49.0 1
        126.0 1
        139.0 1
        1,294 1
        86.0 1
        253.0 1
        75.0 1
        147.0 1
        736.0 1
        114.0 1
        508 1
        1,923 1
        856 1
        1,622 1
        83.0 1
        144.0 1
        64.0 1
        250 1
        443 1
        1,379 1
        575 1
        992 1
        965 1
        1,371 1
        1,314 1
        305 1
        1,525 1
        1,248 1
        170.0 1
        299.0 1
        160.0 1
        5,961 1
        280 1
        1,138 1
        374 1
        505 1
        3,043 1
        376 1
        1,709 1
        1,881 1
        955 1
        1,103 1
        803 1
        1,065 1
        8,792 1
        3,154 1
        1,744 1
        1,720 1
        815 1
        1,075 1
        1,730 1
        654 1
        2,129 1
        2,075 1
        735 1
        3,736 1
        9,729 1
        943 1
        2,427 1
        373 1
        255 1
        1,076 1
        552 1
        645 1
        2,301 1
        3,317 1
        1,487 1
        547.0 1
        120.0 1
        2,790 1
        911 1
        623 1
        677 1
        1,295 1
        197.0 1
        795 1
        466 1
        660 1
        298 1
        268 1
        511 1
        215.0 1
        153.0 1
        747 1
        3,431 1
        5,971 1
        1,726 1
        799 1
        2,060 1
        3,229 1
        1,945 1
        319 1
        103.0 1
        509 1
        644 1
        213.0 1
        331.0 1
        166.0 1
        548.0 1
        822 1
        2,157 1
        310 1
        813 1
        463 1
        515 1
        1,197 1
        451 1
        858 1
        221 1
        887 1
        851 1
        928 1
        47.0 1
        60.0 1
        281.0 1
        1,262 1
        763 1
        1,912 1
        804 1
        883 1
        629 1
        434 1
        743 1
        688 1
        893 1
        178.0 1
        123.0 1
        167.0 1
        118.0 1
        298.0 1
        105.0 1
        71.0 1
        116.0 1
        95.0 1
        3,352 1
        2,964 1
        12,396 1
        2,630 1
        2,191 1
        2,615 1
        678 1
        784 1
        1,498 1
        2,735 1
        1,174 1
        426 1
        1,758 1
        522 1
        1,679 1
        646 1
        1,561 1
        3,225 1
        2,108 1
        3,378 1
        2,403 1
        829 1
        1,920 1
        1,324 1
        1,774 1
        4,855 1
        2,640 1
        2,283 1
        2,329 1
        834 1
        687 1
        2,559 1
        506 1
        2,186 1
        608 1
        2,037 1
        2,032 1
        994 1
        1,207 1
        988 1
        12,860 1
        530 1
        619 1
        368 1
        1,538 1
        5,260 1
        3,557 1
        1,084 1
        1,475 1
        1,230 1
        1,464 1
        2,487 1
        3,063 1
        10,732 1
        327 1
        10,938 1
        2,896 1
        1,874 1
        4,058 1
        89.0 1
        678.0 1
        442 1
        2,850 1
        2,059 1
        329 1
        333 1
        322 1
        224 1
        242 1
        525 1
        477 1
        539 1
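The listing above mixes three spellings of the same kind of number: plain integers ('3'), float-formatted strings ('3.0') and counts with a thousands separator ('1,665'). A small sketch of how such strings can be brought to one numeric type is shown below. The sample values, the name `sample`, the use of -1 as a placeholder for missing counts and the integer result type are assumptions of this example.

```python
import pandas as pd

# Hypothetical sample with the three spellings, plus a missing value.
sample = pd.Series(["3", "3.0", "1,665", None])

rating_count = (
    pd.to_numeric(sample.str.replace(",", "", regex=False), errors="coerce")
    .fillna(-1)   # placeholder for gigs without a rating count (assumption)
    .astype(int)
)

print(rating_count.tolist())
```

After the conversion, '3' and '3.0' fall into the same bucket, so their counts add up in a subsequent `value_counts()`.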
In [ ]:
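# Drop thousands separators ('1,665' -> '1665'), mark missing counts with -1, and convert to float.
df['Rating Count'] = df['Rating Count'].str.replace(',', '', regex=False).fillna(-1).astype(float)
print_unique_elements(df, ['Rating Count'])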
Rating Count:
        1.0 194
        3.0 174
        2.0 169
        5.0 121
        4.0 118
        -1.0 104
        7.0 100
        11.0 99
        12.0 96
        15.0 93
        6.0 89
        10.0 86
        16.0 80
        13.0 78
        21.0 74
        18.0 74
        9.0 73
        8.0 73
        17.0 71
        22.0 67
        20.0 67
        14.0 65
        25.0 61
        24.0 59
        19.0 57
        26.0 52
        35.0 51
        23.0 50
        29.0 48
        33.0 47
        30.0 44
        34.0 41
        28.0 41
        31.0 41
        27.0 39
        40.0 37
        36.0 33
        39.0 33
        45.0 32
        41.0 31
        38.0 30
        48.0 30
        44.0 29
        32.0 29
        43.0 28
        63.0 27
        37.0 27
        56.0 27
        53.0 26
        46.0 26
        42.0 26
        47.0 24
        49.0 24
        79.0 24
        50.0 23
        51.0 22
        62.0 22
        61.0 21
        60.0 20
        54.0 19
        57.0 19
        64.0 19
        101.0 19
        52.0 19
        78.0 19
        70.0 19
        66.0 19
        59.0 18
        91.0 18
        81.0 17
        73.0 17
        58.0 16
        84.0 16
        75.0 16
        55.0 16
        92.0 15
        93.0 15
        82.0 14
        65.0 14
        68.0 14
        153.0 13
        89.0 13
        131.0 13
        99.0 13
        96.0 13
        74.0 13
        114.0 12
        77.0 12
        71.0 12
        80.0 12
        94.0 12
        76.0 12
        97.0 12
        107.0 12
        177.0 12
        72.0 12
        102.0 11
        67.0 11
        69.0 11
        86.0 11
        122.0 11
        197.0 11
        175.0 11
        108.0 11
        110.0 11
        123.0 10
        145.0 10
        115.0 10
        87.0 10
        103.0 10
        144.0 10
        109.0 10
        198.0 10
        83.0 10
        105.0 9
        124.0 9
        219.0 9
        139.0 9
        162.0 9
        106.0 9
        128.0 9
        118.0 9
        143.0 9
        116.0 8
        133.0 8
        185.0 8
        112.0 8
        141.0 8
        135.0 8
        100.0 8
        138.0 8
        201.0 8
        104.0 8
        293.0 8
        182.0 8
        150.0 8
        174.0 8
        111.0 7
        300.0 7
        247.0 7
        136.0 7
        209.0 7
        85.0 7
        195.0 7
        218.0 7
        147.0 7
        199.0 7
        98.0 7
        137.0 7
        192.0 7
        132.0 7
        164.0 7
        211.0 7
        231.0 6
        129.0 6
        243.0 6
        301.0 6
        335.0 6
        287.0 6
        155.0 6
        203.0 6
        307.0 6
        167.0 6
        126.0 6
        229.0 6
        217.0 6
        163.0 6
        134.0 6
        309.0 6
        336.0 6
        183.0 6
        88.0 6
        193.0 6
        170.0 6
        237.0 6
        187.0 6
        130.0 6
        165.0 6
        95.0 6
        171.0 6
        412.0 6
        267.0 5
        256.0 5
        121.0 5
        313.0 5
        189.0 5
        161.0 5
        176.0 5
        274.0 5
        450.0 5
        399.0 5
        148.0 5
        194.0 5
        90.0 5
        156.0 5
        377.0 5
        181.0 5
        117.0 5
        387.0 5
        119.0 5
        120.0 5
        168.0 5
        142.0 5
        157.0 5
        166.0 5
        125.0 5
        356.0 5
        190.0 5
        149.0 5
        160.0 5
        248.0 4
        385.0 4
        284.0 4
        216.0 4
        158.0 4
        154.0 4
        306.0 4
        238.0 4
        146.0 4
        417.0 4
        331.0 4
        240.0 4
        295.0 4
        207.0 4
        283.0 4
        279.0 4
        257.0 4
        452.0 4
        173.0 4
        227.0 4
        341.0 4
        235.0 4
        215.0 4
        351.0 4
        325.0 4
        252.0 4
        304.0 4
        186.0 4
        348.0 4
        260.0 4
        236.0 4
        258.0 4
        159.0 4
        423.0 4
        172.0 4
        342.0 4
        220.0 4
        411.0 4
        178.0 4
        475.0 4
        402.0 4
        223.0 4
        403.0 4
        184.0 4
        501.0 4
        548.0 3
        264.0 3
        324.0 3
        425.0 3
        529.0 3
        272.0 3
        338.0 3
        384.0 3
        478.0 3
        244.0 3
        152.0 3
        435.0 3
        768.0 3
        289.0 3
        513.0 3
        308.0 3
        277.0 3
        547.0 3
        398.0 3
        334.0 3
        179.0 3
        282.0 3
        326.0 3
        651.0 3
        278.0 3
        191.0 3
        246.0 3
        532.0 3
        568.0 3
        275.0 3
        483.0 3
        303.0 3
        253.0 3
        254.0 3
        791.0 3
        978.0 3
        261.0 3
        648.0 3
        206.0 3
        311.0 3
        553.0 3
        296.0 3
        494.0 3
        499.0 3
        641.0 3
        208.0 3
        188.0 3
        266.0 3
        140.0 3
        624.0 3
        127.0 3
        392.0 3
        265.0 3
        169.0 3
        113.0 3
        251.0 3
        312.0 3
        302.0 3
        504.0 3
        249.0 3
        528.0 3
        355.0 3
        228.0 3
        204.0 3
        1665.0 3
        882.0 3
        205.0 2
        556.0 2
        932.0 2
        239.0 2
        458.0 2
        332.0 2
        263.0 2
        606.0 2
        1390.0 2
        232.0 2
        793.0 2
        825.0 2
        461.0 2
        610.0 2
        271.0 2
        1098.0 2
        841.0 2
        180.0 2
        432.0 2
        349.0 2
        573.0 2
        1179.0 2
        543.0 2
        314.0 2
        367.0 2
        389.0 2
        665.0 2
        383.0 2
        602.0 2
        290.0 2
        569.0 2
        196.0 2
        330.0 2
        328.0 2
        1217.0 2
        352.0 2
        344.0 2
        424.0 2
        884.0 2
        1469.0 2
        611.0 2
        846.0 2
        555.0 2
        877.0 2
        777.0 2
        214.0 2
        5291.0 2
        731.0 2
        2834.0 2
        210.0 2
        507.0 2
        544.0 2
        594.0 2
        1205.0 2
        347.0 2
        472.0 2
        545.0 2
        609.0 2
        1468.0 2
        631.0 2
        570.0 2
        546.0 2
        574.0 2
        584.0 2
        396.0 2
        225.0 2
        459.0 2
        151.0 2
        650.0 2
        317.0 2
        233.0 2
        294.0 2
        213.0 2
        428.0 2
        588.0 2
        298.0 2
        339.0 2
        365.0 2
        3896.0 2
        241.0 2
        678.0 2
        234.0 2
        418.0 2
        286.0 2
        628.0 2
        401.0 2
        1809.0 2
        1227.0 2
        380.0 2
        427.0 2
        354.0 2
        2452.0 2
        410.0 2
        292.0 2
        414.0 2
        408.0 2
        754.0 2
        682.0 2
        561.0 2
        245.0 2
        222.0 2
        343.0 2
        596.0 2
        1178.0 2
        652.0 2
        405.0 2
        285.0 2
        767.0 2
        202.0 2
        563.0 2
        703.0 2
        620.0 2
        562.0 2
        627.0 2
        492.0 2
        810.0 2
        421.0 2
        470.0 2
        736.0 2
        464.0 2
        1806.0 2
        680.0 2
        577.0 2
        393.0 2
        593.0 2
        455.0 2
        437.0 2
        230.0 2
        394.0 2
        514.0 2
        346.0 2
        350.0 2
        805.0 2
        262.0 2
        318.0 2
        270.0 2
        321.0 2
        537.0 2
        276.0 2
        659.0 2
        1015.0 2
        519.0 2
        480.0 2
        273.0 2
        1915.0 2
        1116.0 2
        431.0 2
        2069.0 1
        1051.0 1
        1049.0 1
        364.0 1
        630.0 1
        1105.0 1
        998.0 1
        200.0 1
        812.0 1
        1050.0 1
        992.0 1
        353.0 1
        407.0 1
        1377.0 1
        1525.0 1
        820.0 1
        1020.0 1
        2189.0 1
        1248.0 1
        1693.0 1
        1158.0 1
        917.0 1
        400.0 1
        664.0 1
        1061.0 1
        601.0 1
        1503.0 1
        1142.0 1
        453.0 1
        2061.0 1
        1250.0 1
        1473.0 1
        1596.0 1
        586.0 1
        1853.0 1
        771.0 1
        726.0 1
        642.0 1
        837.0 1
        1079.0 1
        441.0 1
        892.0 1
        1881.0 1
        2239.0 1
        3196.0 1
        552.0 1
        1076.0 1
        1344.0 1
        1771.0 1
        255.0 1
        745.0 1
        373.0 1
        1407.0 1
        2203.0 1
        2427.0 1
        476.0 1
        943.0 1
        746.0 1
        1616.0 1
        835.0 1
        1446.0 1
        406.0 1
        933.0 1
        782.0 1
        9729.0 1
        794.0 1
        550.0 1
        1047.0 1
        1355.0 1
        899.0 1
        567.0 1
        5396.0 1
        4015.0 1
        1433.0 1
        870.0 1
        661.0 1
        3046.0 1
        897.0 1
        1052.0 1
        381.0 1
        1923.0 1
        920.0 1
        1294.0 1
        299.0 1
        1010.0 1
        508.0 1
        818.0 1
        370.0 1
        486.0 1
        1692.0 1
        1208.0 1
        769.0 1
        720.0 1
        1192.0 1
        888.0 1
        1166.0 1
        305.0 1
        1371.0 1
        2852.0 1
        479.0 1
        1128.0 1
        1661.0 1
        1549.0 1
        1885.0 1
        386.0 1
        1850.0 1
        3291.0 1
        1551.0 1
        973.0 1
        1016.0 1
        787.0 1
        443.0 1
        4676.0 1
        366.0 1
        909.0 1
        8847.0 1
        1897.0 1
        1379.0 1
        485.0 1
        600.0 1
        1622.0 1
        575.0 1
        1866.0 1
        517.0 1
        512.0 1
        579.0 1
        1132.0 1
        2016.0 1
        4276.0 1
        320.0 1
        1121.0 1
        1191.0 1
        250.0 1
        807.0 1
        212.0 1
        2574.0 1
        832.0 1
        2865.0 1
        358.0 1
        850.0 1
        422.0 1
        2622.0 1
        1242.0 1
        391.0 1
        558.0 1
        587.0 1
        1143.0 1
        469.0 1
        827.0 1
        557.0 1
        698.0 1
        986.0 1
        756.0 1
        357.0 1
        345.0 1
        462.0 1
        2773.0 1
        1532.0 1
        2241.0 1
        11201.0 1
        647.0 1
        774.0 1
        881.0 1
        801.0 1
        603.0 1
        1707.0 1
        288.0 1
        1586.0 1
        1689.0 1
        965.0 1
        1101.0 1
        549.0 1
        1139.0 1
        433.0 1
        1766.0 1
        862.0 1
        560.0 1
        2103.0 1
        1314.0 1
        559.0 1
        5310.0 1
        2115.0 1
        715.0 1
        520.0 1
        430.0 1
        1832.0 1
        1045.0 1
        1040.0 1
        1032.0 1
        775.0 1
        989.0 1
        950.0 1
        3442.0 1
        2005.0 1
        4422.0 1
        1244.0 1
        2528.0 1
        749.0 1
        1353.0 1
        1220.0 1
        554.0 1
        1564.0 1
        526.0 1
        226.0 1
        925.0 1
        1022.0 1
        2658.0 1
        1374.0 1
        1681.0 1
        765.0 1
        849.0 1
        721.0 1
        2778.0 1
        867.0 1
        497.0 1
        1988.0 1
        1422.0 1
        3736.0 1
        525.0 1
        322.0 1
        5088.0 1
        723.0 1
        224.0 1
        4342.0 1
        757.0 1
        242.0 1
        340.0 1
        581.0 1
        667.0 1
        429.0 1
        477.0 1
        395.0 1
        1084.0 1
        1207.0 1
        994.0 1
        2032.0 1
        3225.0 1
        1561.0 1
        646.0 1
        1679.0 1
        522.0 1
        1758.0 1
        1174.0 1
        333.0 1
        329.0 1
        2735.0 1
        327.0 1
        1230.0 1
        1475.0 1
        3557.0 1
        988.0 1
        5260.0 1
        1538.0 1
        448.0 1
        368.0 1
        619.0 1
        530.0 1
        12860.0 1
        10938.0 1
        1935.0 1
        2896.0 1
        4726.0 1
        1874.0 1
        4058.0 1
        716.0 1
        371.0 1
        2089.0 1
        442.0 1
        2850.0 1
        2059.0 1
        3566.0 1
        2630.0 1
        1498.0 1
        2487.0 1
        1431.0 1
        595.0 1
        2640.0 1
        4855.0 1
        1774.0 1
        1324.0 1
        1920.0 1
        12396.0 1
        3317.0 1
        2335.0 1
        2301.0 1
        645.0 1
        1318.0 1
        500.0 1
        5800.0 1
        3781.0 1
        291.0 1
        885.0 1
        297.0 1
        1006.0 1
        2607.0 1
        2359.0 1
        1559.0 1
        946.0 1
        1274.0 1
        2283.0 1
        829.0 1
        784.0 1
        2037.0 1
        2615.0 1
        535.0 1
        904.0 1
        2191.0 1
        2108.0 1
        426.0 1
        446.0 1
        3378.0 1
        585.0 1
        379.0 1
        834.0 1
        1983.0 1
        1119.0 1
        608.0 1
        2186.0 1
        516.0 1
        447.0 1
        685.0 1
        506.0 1
        2559.0 1
        687.0 1
        2329.0 1
        523.0 1
        1827.0 1
        1464.0 1
        3063.0 1
        1150.0 1
        803.0 1
        3043.0 1
        376.0 1
        5961.0 1
        1709.0 1
        1262.0 1
        281.0 1
        928.0 1
        887.0 1
        955.0 1
        1103.0 1
        813.0 1
        1065.0 1
        1601.0 1
        221.0 1
        8792.0 1
        858.0 1
        3154.0 1
        451.0 1
        856.0 1
        1197.0 1
        515.0 1
        463.0 1
        1912.0 1
        804.0 1
        763.0 1
        842.0 1
        629.0 1
        676.0 1
        735.0 1
        337.0 1
        2075.0 1
        2129.0 1
        654.0 1
        316.0 1
        690.0 1
        4168.0 1
        1283.0 1
        1730.0 1
        1075.0 1
        498.0 1
        323.0 1
        815.0 1
        1925.0 1
        362.0 1
        1720.0 1
        1744.0 1
        280.0 1
        1138.0 1
        374.0 1
        505.0 1
        1021.0 1
        259.0 1
        883.0 1
        434.0 1
        10732.0 1
        1945.0 1
        2678.0 1
        510.0 1
        1000.0 1
        644.0 1
        2251.0 1
        509.0 1
        761.0 1
        1958.0 1
        319.0 1
        3431.0 1
        2580.0 1
        1361.0 1
        911.0 1
        3229.0 1
        2060.0 1
        799.0 1
        1726.0 1
        625.0 1
        707.0 1
        5971.0 1
        3352.0 1
        747.0 1
        2964.0 1
        2403.0 1
        2790.0 1
        623.0 1
        743.0 1
        268.0 1
        688.0 1
        893.0 1
        3577.0 1
        2007.0 1
        851.0 1
        1534.0 1
        709.0 1
        310.0 1
        2157.0 1
        822.0 1
        511.0 1
        1943.0 1
        2943.0 1
        1038.0 1
        1005.0 1
        660.0 1
        748.0 1
        466.0 1
        795.0 1
        566.0 1
        1295.0 1
        1487.0 1
        677.0 1
        964.0 1
        539.0 1

Fill missing Rating Count values with the median Rating Count of the same Field and Seller Level¶

In [ ]:
# Treat the -1 placeholder as a missing value.
# Assigning the result back avoids relying on inplace=True on a column selection.
df["Rating Count"] = df["Rating Count"].replace(-1, np.nan)
grouping_fields = ["Field", "Seller Level"]

# Fill each missing value with the median of its (Field, Seller Level) group.
# Groups with no observed Rating Count at all stay NaN.
df["Rating Count"] = df.groupby(grouping_fields)["Rating Count"].transform(lambda x: x.fillna(x.median()))
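The group-wise median fill can be checked on a small example. The sketch below uses a made-up toy frame (the rows and the `toy` and `filled` names are hypothetical, not taken from the Fiverr data) and assumes, as in the cell above, that -1 marks a missing rating count.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; -1 stands for a missing rating count.
toy = pd.DataFrame({
    "Field": ["Design", "Design", "Design", "Writing", "Writing"],
    "Seller Level": ["Level 1", "Level 1", "Level 1", "Level 2", "Level 2"],
    "Rating Count": [10, -1, 30, 4, -1],
})

# Turn the placeholder into NaN so that median() ignores it.
toy["Rating Count"] = toy["Rating Count"].replace(-1, np.nan)

# Fill NaN with the median of the matching (Field, Seller Level) group.
toy["Rating Count"] = (
    toy.groupby(["Field", "Seller Level"])["Rating Count"]
       .transform(lambda s: s.fillna(s.median()))
)

filled = toy["Rating Count"].tolist()
print(filled)
```

In this toy frame the missing Design value becomes 20.0 (the median of 10 and 30) and the missing Writing value becomes 4.0. A group whose values are all missing has a NaN median, so its rows would remain NaN and need a separate fallback.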
In [ ]:
print_unique_elements(df, ['Member Since'])
Member Since:
        May 2020 93
        Dec 2020 91
        Jun 2020 88
        Aug 2020 82
        Oct 2020 79
        Mar 2021 76
        Nov 2020 74
        Jun 2021 70
        Jul 2021 69
        Oct 2021 69
        Dec 2023 67
        Apr 2020 66
        Mar 2020 63
        May 2022 60
        Jul 2020 60
        Dec 2019 60
        Jan 2020 59
        Jan 2021 58
        Aug 2019 58
        Aug 2022 58
        Nov 2021 57
        May 2021 56
        May 2023 56
        Feb 2021 56
        Apr 2022 55
        Aug 2023 53
        Oct 2023 53
        Jan 2023 53
        Apr 2023 52
        Nov 2023 51
        Sep 2022 51
        Sep 2020 51
        Feb 2022 51
        Sep 2021 51
        Jul 2019 50
        Jan 2022 50
        Apr 2021 50
        May 2019 49
        Feb 2020 48
        Nov 2019 46
        Jan 2024 45
        Jul 2022 44
        Nov 2022 44
        Jul 2023 44
        Mar 2022 44
        Jun 2022 43
        Aug 2021 43
        Sep 2019 42
        Feb 2023 42
        Mar 2023 40
        Apr 2019 40
        Oct 2022 39
        Sep 2023 38
        Jan 2018 38
        Jun 2023 37
        Mar 2024 37
        Dec 2021 37
        Oct 2019 36
        Feb 2024 36
        Jan 2019 35
        Feb 2019 34
        Jun 2019 31
        Mar 2019 31
        Sep 2017 31
        Mar 2018 30
        Dec 2022 30
        Oct 2018 30
        Sep 2018 29
        Jan 2017 29
        Apr 2024 27
        May 2018 27
        Nov 2016 24
        Dec 2018 24
        Dec 2016 23
        Oct 2017 23
        Aug 2016 22
        Feb 2018 22
        Feb 2017 22
        Jul 2018 21
        Jun 2016 21
        Jun 2018 20
        May 2017 19
        Aug 2017 18
        Jul 2016 18
        Mar 2017 18
        Oct 2015 18
        Apr 2018 18
        Dec 2017 17
        Jun 2017 17
        Jul 2017 15
        Apr 2016 14
        Nov 2018 13
        Nov 2017 13
        Dec 2014 13
        Aug 2018 12
        Sep 2016 12
        Feb 2016 12
        Jun 2015 11
        Jul 2015 11
        Sep 2015 11
        Oct 2016 11
        May 2015 10
        Apr 2017 10
        Nov 2015 10
        Feb 2015 10
        Mar 2016 10
        May 2014 10
        Jan 2016 9
        Aug 2014 8
        Sep 2013 8
        Jan 2013 8
        Jan 2014 8
        Oct 2014 8
        May 2016 8
        Jul 2014 7
        Dec 2015 7
        Jan 2015 7
        Aug 2015 7
        Feb 2014 7
        Aug 2013 7
        May 2024 6
        Feb 2013 6
        Oct 2013 5
        Mar 2012 5
        Mar 2015 5
        Apr 2015 4
        May 2013 4
        Aug 2012 4
        Sep 2014 4
        Sept 2019 4
        Apr 2014 4
        Jun 2012 4
        Sept 2017 3
        Sept 2020 3
        Jun 2014 3
        Mar 2014 3
        Nov 2013 3
        May 2012 3
        Sept 2021 3
        Jun 2013 3
        Jan 2012 2
        Sep 2012 2
        Dec 2011 2
        Oct 2011 2
        Jul 2012 2
        Nov 2014 2
        Sept 2013 2
        Apr 2012 2
        Apr 2013 2
        Oct 2012 2
        Sept 2022 2
        Aug 2011 1
        Dec 2013 1
        Mar 2011 1
        Sept 2016 1
        Nov 2012 1
        Sept 2018 1
In [ ]:
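# Normalize the four-letter "Sept" abbreviation so that %b can parse it.
df['Member Since'] = df['Member Since'].str.replace('Sept', 'Sep', regex=False)
# Parse "Mon YYYY" and store it as an ISO date string (first day of the month).
df["Member Since"] = pd.to_datetime(df["Member Since"], format="%b %Y")
df["Member Since"] = df["Member Since"].dt.strftime("%Y-%m-%d")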
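As a quick check of the normalization step, the sketch below runs the same three operations on a few hand-written strings (the `sample` series is hypothetical). It assumes an English locale, since `%b` matches locale-dependent month abbreviations.

```python
import pandas as pd

# Hypothetical sample mixing the "Sept" and "Sep" spellings.
sample = pd.Series(["Sept 2019", "Sep 2019", "May 2020"])

# Same steps as the cell above: unify the abbreviation, parse, format as ISO.
cleaned = sample.str.replace("Sept", "Sep", regex=False)
parsed = pd.to_datetime(cleaned, format="%b %Y")
iso = parsed.dt.strftime("%Y-%m-%d").tolist()
print(iso)
```

Both spellings of September 2019 end up as the same date string, which is why their counts merge into a single row after the conversion.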
In [ ]:
print_unique_elements(df, ['Member Since'])
Member Since:
        2020-05-01 93
        2020-12-01 91
        2020-06-01 88
        2020-08-01 82
        2020-10-01 79
        2021-03-01 76
        2020-11-01 74
        2021-06-01 70
        2021-10-01 69
        2021-07-01 69
        2023-12-01 67
        2020-04-01 66
        2020-03-01 63
        2019-12-01 60
        2022-05-01 60
        2020-07-01 60
        2020-01-01 59
        2019-08-01 58
        2022-08-01 58
        2021-01-01 58
        2021-11-01 57
        2023-05-01 56
        2021-05-01 56
        2021-02-01 56
        2022-04-01 55
        2020-09-01 54
        2021-09-01 54
        2023-01-01 53
        2023-08-01 53
        2022-09-01 53
        2023-10-01 53
        2023-04-01 52
        2022-02-01 51
        2023-11-01 51
        2021-04-01 50
        2022-01-01 50
        2019-07-01 50
        2019-05-01 49
        2020-02-01 48
        2019-11-01 46
        2019-09-01 46
        2024-01-01 45
        2022-11-01 44
        2022-07-01 44
        2022-03-01 44
        2023-07-01 44
        2022-06-01 43
        2021-08-01 43
        2023-02-01 42
        2019-04-01 40
        2023-03-01 40
        2022-10-01 39
        2023-09-01 38
        2018-01-01 38
        2024-03-01 37
        2021-12-01 37
        2023-06-01 37
        2024-02-01 36
        2019-10-01 36
        2019-01-01 35
        2017-09-01 34
        2019-02-01 34
        2019-06-01 31
        2019-03-01 31
        2018-10-01 30
        2018-03-01 30
        2018-09-01 30
        2022-12-01 30
        2017-01-01 29
        2024-04-01 27
        2018-05-01 27
        2016-11-01 24
        2018-12-01 24
        2016-12-01 23
        2017-10-01 23
        2018-02-01 22
        2017-02-01 22
        2016-08-01 22
        2018-07-01 21
        2016-06-01 21
        2018-06-01 20
        2017-05-01 19
        2017-03-01 18
        2015-10-01 18
        2017-08-01 18
        2016-07-01 18
        2018-04-01 18
        2017-06-01 17
        2017-12-01 17
        2017-07-01 15
        2016-04-01 14
        2018-11-01 13
        2016-09-01 13
        2014-12-01 13
        2017-11-01 13
        2018-08-01 12
        2016-02-01 12
        2015-07-01 11
        2015-09-01 11
        2016-10-01 11
        2015-06-01 11
        2014-05-01 10
        2017-04-01 10
        2015-02-01 10
        2016-03-01 10
        2013-09-01 10
        2015-05-01 10
        2015-11-01 10
        2016-01-01 9
        2014-08-01 8
        2014-01-01 8
        2014-10-01 8
        2013-01-01 8
        2016-05-01 8
        2014-02-01 7
        2014-07-01 7
        2015-12-01 7
        2015-01-01 7
        2015-08-01 7
        2013-08-01 7
        2024-05-01 6
        2013-02-01 6
        2013-10-01 5
        2012-03-01 5
        2015-03-01 5
        2012-06-01 4
        2015-04-01 4
        2013-05-01 4
        2012-08-01 4
        2014-09-01 4
        2014-04-01 4
        2012-05-01 3
        2014-06-01 3
        2014-03-01 3
        2013-11-01 3
        2013-06-01 3
        2012-04-01 2
        2011-12-01 2
        2013-04-01 2
        2012-10-01 2
        2012-01-01 2
        2012-09-01 2
        2014-11-01 2
        2012-07-01 2
        2011-10-01 2
        2011-03-01 1
        2011-08-01 1
        2013-12-01 1
        2012-11-01 1
In [ ]:
print_unique_elements(df , ['Avg Response Time'])
Avg Response Time:
        1 hour 2544
        2 hours 505
        3 hours 300
        4 hours 221
        5 hours 131
        6 hours 90
        7 hours 58
        8 hours 55
        9 hours 46
        1 day 45
        2 days 37
        10 hours 31
        11 hours 23
        3 days 19
        12 hours 18
        17 hours 15
        16 hours 11
        13 hours 10
        19 hours 10
        21 hours 9
        4 days 8
        15 hours 8
        14 hours 8
        23 hours 7
        6 days 6
        22 hours 5
        20 hours 4
        12 days 4
        18 hours 3
        11 days 2
        5 days 1
        7 days 1
        29 days 1
        13 days 1
In [ ]:
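def convert_to_hours(time_str):
    """Convert strings such as '3 hours' or '2 days' to a number of hours."""
    if isinstance(time_str, str):
        if 'day' in time_str:
            # One day counts as 24 hours.
            return int(time_str.split(' ')[0]) * 24
        elif 'hour' in time_str:
            return int(time_str.split(' ')[0])
    # Non-string values (e.g. NaN) and unrecognized units pass through unchanged.
    return time_str

df['Avg Response Time'] = df['Avg Response Time'].apply(convert_to_hours)
print_unique_elements(df, ['Avg Response Time'])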
Avg Response Time:
        1.0 2544
        2.0 505
        3.0 300
        4.0 221
        5.0 131
        6.0 90
        7.0 58
        8.0 55
        9.0 46
        24.0 45
        48.0 37
        10.0 31
        11.0 23
        72.0 19
        12.0 18
        17.0 15
        16.0 11
        13.0 10
        19.0 10
        21.0 9
        96.0 8
        15.0 8
        14.0 8
        23.0 7
        144.0 6
        22.0 5
        20.0 4
        288.0 4
        18.0 3
        264.0 2
        120.0 1
        168.0 1
        696.0 1
        312.0 1
In [ ]:
print_unique_elements(df, ['Last Delivery'])
Last Delivery:
        1 day 605
        1 week 450
        2 days 373
        3 days 261
        1 month 210
        2 weeks 209
        4 days 185
        5 days 145
        3 weeks 139
        about 2 hours 107
        2 months 105
        about 4 hours 90
        about 3 hours 82
        about 5 hours 81
        about 10 hours 68
        6 days 67
        about 6 hours 63
        about 7 hours 62
        about 1 hour 56
        about 9 hours 53
        about 11 hours 53
        about 15 hours 52
        7 months 50
        about 8 hours 48
        about 21 hours 43
        4 months 42
        3 months 38
        about 14 hours 38
        about 19 hours 37
        about 17 hours 36
        1 year 36
        about 13 hours 35
        about 12 hours 34
        5 months 32
        about 20 hours 31
        about 22 hours 30
        about 18 hours 29
        about 16 hours 28
        4 weeks 27
        about 23 hours 27
        6 months 23
        8 months 23
        10 months 22
        11 months 15
        9 months 10
        about 45 minutes 6
        2 years 6
        about 8 minutes 4
        about 49 minutes 4
        about 48 minutes 3
        about 11 minutes 3
        about 33 minutes 3
        about 47 minutes 3
        about 1 minute 3
        about 19 minutes 3
        about 52 minutes 3
        about 4 minutes 3
        about 50 minutes 2
        about 57 minutes 2
        just now 2
        about 6 minutes 2
        about 44 minutes 2
        about 28 minutes 2
        about 32 minutes 2
        about 24 minutes 2
        about 2 minutes 2
        about 46 minutes 2
        about 54 minutes 2
        about 12 minutes 2
        about 5 minutes 2
        about 23 minutes 2
        about 21 minutes 2
        about 37 minutes 2
        about 38 minutes 2
        about 53 minutes 2
        about 55 minutes 2
        about 7 minutes 2
        3 years 1
        about 25 minutes 1
        about 42 minutes 1
        about 18 minutes 1
        12 months 1
        about 36 minutes 1
        about 9 minutes 1
        about 15 minutes 1
        about 35 minutes 1
        about 34 minutes 1
        about 16 minutes 1
        about 17 minutes 1
        about 20 minutes 1
        about 29 minutes 1
In [ ]:
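def convert_to_days(val):
    """Convert strings such as '3 days' or '2 weeks' to a whole number of days."""
    if pd.isnull(val):
        return np.nan
    try:
        # Only two-token values ("<number> <unit>") unpack cleanly here.
        quantity, unit = val.split()
        quantity = int(quantity)
        if unit in ['day', 'days']:
            return quantity
        elif unit in ['week', 'weeks']:
            return quantity * 7
        elif unit in ['month', 'months']:
            return quantity * 30   # a month is approximated as 30 days
        elif unit in ['year', 'years']:
            return quantity * 365
        else:
            return 0
    except (ValueError, AttributeError):
        # Sub-day values ("about 2 hours", "about 45 minutes", "just now")
        # do not match "<number> <unit>" and are counted as 0 days.
        return 0

df['Last Delivery'] = df['Last Delivery'].apply(convert_to_days)
print_unique_elements(df, ['Last Delivery'])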
Last Delivery:
        0.0 1273
        1.0 605
        7.0 450
        2.0 373
        3.0 261
        30.0 210
        14.0 209
        4.0 185
        5.0 145
        21.0 139
        60.0 105
        6.0 67
        210.0 50
        120.0 42
        90.0 38
        365.0 36
        150.0 32
        28.0 27
        180.0 23
        240.0 23
        300.0 22
        330.0 15
        270.0 10
        730.0 6
        1095.0 1
        360.0 1
In [ ]:
df['Last Delivery'].isnull().sum()
Out[ ]:
1574
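The conversion above maps every sub-day value ("about 2 hours", "about 45 minutes", "just now") to 0 days. If finer resolution were needed, one possible variant is sketched below. The function `to_fractional_days` and the `UNIT_DAYS` table are hypothetical names, not part of the notebook, and the sketch assumes the raw strings follow the patterns listed in the output above. Unlike `convert_to_days`, it returns NaN rather than 0 for a format it does not recognize.

```python
import re

import numpy as np
import pandas as pd

# Unit lengths in days; months and years use the same 30/365 approximation as above.
UNIT_DAYS = {"minute": 1 / 1440, "hour": 1 / 24, "day": 1, "week": 7, "month": 30, "year": 365}

# Optional "about", an integer quantity, then a singular or plural unit.
PATTERN = re.compile(r"(?:about\s+)?(\d+)\s+(minute|hour|day|week|month|year)s?")

def to_fractional_days(val):
    """Hypothetical variant of convert_to_days that keeps sub-day resolution."""
    if pd.isnull(val):
        return np.nan
    text = str(val).strip().lower()
    if text == "just now":
        return 0.0
    match = PATTERN.fullmatch(text)
    if match is None:
        return np.nan  # unrecognized format: keep it missing instead of forcing 0
    return int(match.group(1)) * UNIT_DAYS[match.group(2)]

examples = ["about 2 hours", "3 weeks", "just now", "1 year", None]
converted = [to_fractional_days(v) for v in examples]
print(converted)
```

It would have to be applied to the raw strings, before `convert_to_days` overwrites the column. Whether the extra resolution matters depends on the later analysis; for a coarse "days since last delivery" feature the whole-day version may be sufficient.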
In [ ]:
df['Language'] = df['Language'].str.replace('I speak ', '')
print_unique_elements(df, ['Language'])
Language:
        English 1905
        English, Spanish 324
        Urdu, English 269
        English, French 133
        English, German 113
        Bengali, English 102
        English, Urdu 94
        Hindi, English 80
        English, Italian 71
        English, Hindi 69
        Bengali, English, Hindi 44
        English, Portuguese 44
        English, Indonesian 44
        English, Russian 43
        English, Arabic 42
        English, Arabic, French 41
        English, Chinese 41
        English, Spanish, French, German 40
        Urdu, English, Hindi 39
        English, Spanish, German, French 38
        Spanish, English 36
        English, Ukrainian 34
        English, Spanish, French 31
        English, French, Spanish 29
        English, Turkish 29
        English, Tagalog 29
        English, French, German, Spanish 27
        English, German, Spanish, French 27
        English, Japanese 26
        English, Portuguese, Spanish 25
        Sinhala, English 23
        English, Urdu, Hindi 23
        English, Hebrew 21
        English, Greek 21
        English, German, French, Spanish 20
        Urdu, English, Punjabi 20
        English, Polish 19
        English, German, Spanish 19
        English, Spanish, Portuguese 19
        English, Dutch 18
        English, German, French 18
        English, Spanish, German 17
        Urdu, Punjabi, English 17
        Urdu, English, Spanish 16
        English, Russian, Ukrainian 16
        English, French, Spanish, German 16
        English, Spanish, Italian 15
        English, Italian, Spanish 15
        English, Arabic, French, Spanish 14
        Urdu, English, Arabic 14
        English, Sinhala 14
        English, Bengali 14
        Urdu, Pashto, English 13
        English, Romanian 13
        English, French, Arabic 12
        English, Swedish 12
        English, French, German 12
        English, Serbian 12
        English, Urdu, Punjabi 12
        English, Italian, French, Spanish 11
        Urdu, Hindi, English 11
        Urdu, English, French 10
        Urdu, English, French, German 10
        Gujarati, English, Hindi 10
        Tamil, English 10
        Hindi, English, Urdu 10
        English, Swahili 9
        English, Gujarati, Hindi 9
        Bengali, English, Spanish, German 9
        English, Bengali, Hindi 9
        English, Hindi, Gujarati 9
        Urdu, English, Chinese 9
        English, French, Spanish, Italian 9
        Spanish 9
        English, Bulgarian 9
        English, Spanish, French, Italian 9
        Urdu, English, Hindi, Punjabi 9
        Indonesian, English 8
        English, Slovenian 8
        English, Urdu, French, Spanish 8
        English, Dutch, German, French 8
        Urdu, English, Spanish, French 8
        English, Korean 8
        English, Vietnamese 8
        Hindi, Urdu, English 8
        English, German, Italian, Spanish 7
        Ukrainian, English 7
        Urdu, Punjabi, English, Arabic 7
        English, Ukrainian, Polish 7
        English, Italian, Spanish, French 7
        Urdu, English, German 6
        English, Hungarian, German 6
        English, Albanian 6
        English, German, Spanish, Dutch 6
        Bengali, Hindi, English 6
        Urdu, Punjabi, English, Hindi 6
        English, Malay 6
        Urdu, English, Punjabi, Hindi 6
        French, English 6
        Bengali, English, Hindi, Urdu 6
        English, Urdu, Pashto 6
        English, Spanish, German, Italian 6
        English, German, French, Italian 6
        English, Italian, Spanish, German 5
        English, Estonian, Spanish 5
        English, Armenian, German, Russian 5
        English, Persian 5
        English, Greek, Russian 5
        English, Ukrainian, Russian 5
        English, Portuguese, Spanish, Italian 5
        English, Spanish, Italian, French 5
        English, Ukrainian, Russian, Spanish 5
        Hindi, Gujarati, English 5
        Telugu, English 5
        Russian, Ukrainian, English 5
        English, Italian, French 5
        English, Hindi, Spanish, French 5
        English, Dutch, German 5
        English, Arabic, French, German 5
        Italian, English 5
        English, Hindi, Punjabi 5
        Bengali, English, Hebrew, Hindi 5
        Urdu, Punjabi, Hindi, English 5
        English, French, Italian 5
        Bengali, English, Spanish 5
        Bengali, English, German 5
        English, Portuguese, French 5
        English, Spanish, Italian, German 4
        Russian, English 4
        Chinese, English 4
        English, German, French, Thai 4
        English, Swahili, Kikuyu 4
        Urdu, English, Spanish, German 4
        English, Indonesian, Malay, German 4
        English, Polish, Spanish 4
        French, German, Spanish, English 4
        English, Nepali 4
        English, German, Spanish, Arabic 4
        Bengali, English, Hindi, Spanish 4
        Urdu, English, Hindi, Spanish 4
        English, Spanish, Catalan 4
        English, Portuguese, German, French 4
        English, Croatian 4
        English, Dutch, French 4
        English, Urdu, German 4
        English, Hungarian 4
        German, English 4
        English, Urdu, Arabic 4
        English, Spanish, Dutch, German 4
        English, Georgian, Russian 4
        Bengali, English, Hindi, Hebrew 4
        English, Hindi, French, Spanish 4
        English, Slovak 4
        Urdu, English, German, French 4
        English, French, Arabic, Spanish 4
        English, Indonesian, Malay 4
        Bengali, English, Spanish, French 4
        English, Urdu, Spanish 4
        English, Russian, French 4
        English, Spanish, Portuguese, Turkish 4
        English, Slovenian, Croatian, Latin 4
        Urdu, English, Hindi, French 4
        English, Dutch, Spanish, German 4
        Hindi, English, Spanish, French 4
        English, Urdu, Hindi, Punjabi 4
        English, Hindi, Marathi 4
        English, Russian, German 4
        English, Macedonian, Serbian 3
        English, Greek, Spanish 3
        English, Urdu, German, French 3
        English, Ukrainian, Russian, Polish 3
        English, Urdu, Arabic, German 3
        Urdu, English, Arabic, French 3
        Bengali, English, German, French 3
        English, Spanish, Galician 3
        English, German, Greek 3
        Hindi, Bengali, English 3
        English, Thai 3
        English, Hindi, German, Spanish 3
        English, Italian, French, German 3
        English, Korean, German, Thai 3
        Ukrainian, Russian, English 3
        English, Afrikaans 3
        English, Urdu, Spanish, French 3
        English, Norwegian Bokmål, Swedish, Danish 3
        English, German, French, Korean 3
        Turkish, English 3
        English, Turkish, German 3
        Urdu, English, Punjabi, Arabic 3
        English, German, Thai, Spanish 3
        English, Urdu, Spanish, German 3
        English, French, Dutch 3
        Urdu, English, German, Spanish 3
        English, German, Italian 3
        English, Hebrew, French, Spanish 3
        Punjabi, English, Hindi 3
        Hindi, Punjabi, English 3
        English, Indonesian, Chinese 3
        English, Arabic, Spanish, French 3
        English, Danish, Swedish, German 3
        English, Dutch, French, German 3
        Arabic, English 3
        English, French, Spanish, Dutch 3
        Urdu, English, Hindi, Arabic 3
        English, Serbian, Portuguese, Spanish 3
        English, French, Russian, German 3
        English, Hungarian, Romanian 3
        English, Chinese, German 3
        English, French, German, Italian 3
        Urdu, Hindi, English, Punjabi 3
        English, Serbian, German 3
        English, Portuguese, French, Spanish 3
        Hindi, English, French 3
        English, Arabic, German, French 3
        English, Hindi, Urdu 3
        Yoruba, English 3
        English, German, French, Arabic 3
        English, Portuguese, Spanish, French 3
        English, Yoruba 3
        English, Tamil 3
        English, Arabic, German, Spanish 3
        German, French, Spanish, English 3
        English, Sindhi, Urdu, Hindi 2
        Urdu, English, Arabic, Spanish 2
        English, Russian, Ukrainian, Polish 2
        Portuguese, English 2
        Sindhi, English, Urdu 2
        English, Bengali, Hindi, Urdu 2
        English, Georgian 2
        English, Portuguese, Spanish, German 2
        English, Russian, Polish, French 2
        English, Portuguese, German, Spanish 2
        English, Spanish, Portuguese, German 2
        English, Serbian, Russian 2
        English, Danish 2
        Bengali, English, Arabic 2
        Hindi, English, Punjabi 2
        Urdu, English, French, Spanish 2
        English, German, Western Frisian 2
        German, English, French 2
        English, Latvian, Russian 2
        English, Finnish, Swedish 2
        Sinhala, English, German 2
        English, Chinese, Thai, Lao 2
        English, Chinese, Malay, Japanese 2
        English, Spanish, Russian, French 2
        English, Russian, Chinese 2
        English, Greek, Turkish, Arabic 2
        English, Spanish, French, Arabic 2
        English, Urdu, French, German 2
        Bengali, English, Hindi, Arabic 2
        English, French, Danish 2
        English, Turkish, German, French 2
        English, Italian, Greek 2
        English, Spanish, Catalan, French 2
        English, Indonesian, German, French 2
        English, Macedonian, Bulgarian, Serbian 2
        Belarusian, English 2
        Bengali, English, Hindi, German 2
        Urdu, Sindhi, English, Punjabi 2
        English, Ukrainian, German 2
        English, Spanish, Hebrew, German 2
        English, Chinese, Japanese 2
        English, Spanish, German, Chinese 2
        English, Spanish, Italian, Portuguese 2
        English, Finnish 2
        English, Hindi, Nepali, Bengali 2
        English, Russian, Ukrainian, German 2
        English, French, German, Portuguese 2
        English, Russian, Hebrew 2
        English, Romanian, Italian 2
        English, Russian, Belarusian 2
        English, Arabic, Spanish 2
        English, Lithuanian 2
        English, Norwegian Bokmål, Spanish 2
        English, Estonian 2
        English, German, Turkish, Indonesian 2
        English, Armenian, Russian 2
        Tamil, English, Sinhala, Malayalam 2
        English, Croatian, Italian 2
        English, Polish, Russian 2
        English, Greek, French 2
        English, Spanish, Hindi 2
        English, French, Italian, Spanish 2
        English, Italian, Portuguese 2
        English, Dutch, Spanish, French 2
        English, French, Chinese 2
        English, Sinhala, Tamil 2
        English, Russian, Croatian, Serbian 2
        English, Russian, Ukrainian, Czech 2
        Business plans 2
        English, French, Portuguese, German 2
        English, Ukrainian, Polish, Italian 2
        English, French, Italian, German 2
        English, Urdu, French 2
        English, Yoruba, Spanish 2
        English, French, Bulgarian 2
        English, Romanian, Russian 2
        English, Spanish, Japanese 2
        English, Hebrew, Spanish 2
        English, Japanese, German 2
        English, Italian, German, Spanish 2
        English, Slovenian, Croatian, German 2
        English, Spanish, Russian, German 2
        Norwegian Bokmål, English 2
        English, German, Arabic, French 2
        Urdu, English, French, Chinese 2
        English, Hindi, Marathi, Gujarati 2
        English, Hausa, Igbo 2
        English, Punjabi, Urdu 2
        English, Hindi, Nepali 2
        English, Italian, Romanian 2
        English, German, Hindi 2
        English, Urdu, Pashto, Hindi 2
        Pashto, Urdu, English 2
        Urdu, English, Arabic, German 2
        Thai, Korean, English, Urdu 2
        Bengali, English, Spanish, Arabic 2
        English, German, Arabic, Spanish 2
        English, Serbian, Croatian, Bosnian 2
        Hindi, English, German, Spanish 2
        English, Spanish, Dutch, Italian 2
        English, Kannada, Hindi 2
        Spanish, French, German, English 2
        Bengali, English, Italian, Hindi 2
        Bengali, English, German, Spanish 2
        Bengali, English, French, German 2
        Urdu, English, Spanish, Arabic 2
        English, German, Spanish, Italian 2
        English, German, Russian, French 2
        English, Urdu, Italian, German 2
        English, French, Urdu 2
        English, Thai, Korean, Indonesian 2
        Bengali, English, French, Spanish 2
        English, French, German, Arabic 2
        English, Spanish, Ukrainian 2
        English, German, Turkish 2
        Hindi, English, Korean, Thai 2
        English, Urdu, Spanish, Arabic 2
        English, Indonesian, Javanese 2
        Arabic 2
        English, German, Thai 2
        English, Russian, Italian 2
        English, Igbo 2
        French, English, German, Spanish 2
        English, Thai, Korean 2
        Urdu, Sindhi, English 2
        Urdu, English, Spanish, Russian 2
        Urdu, English, Arabic, Chinese 2
        English, Urdu, Hindi, Arabic 2
        English, German, Russian 2
        Pashto, English, Urdu 2
        English, Albanian, Greek, Italian 2
        Hindi, English, Spanish 2
        English, German, Croatian 2
        English, German, Dutch 2
        English, German, Spanish, Russian 2
        English, Urdu, French, Arabic 2
        English, Hebrew, Russian 2
        Sindhi, English, Urdu, Hindi 2
        Vietnamese, English 2
        Russian, Ukrainian, English, Polish 2
        English, German, French, Bengali 1
        French, Spanish, German, Chinese 1
        English, Urdu, German, Hindi 1
        English, Czech, Italian, German 1
        English, Spanish, Georgian 1
        English, Hindi, Arabic, Spanish 1
        English, Hindi, Spanish, Arabic 1
        English, Arabic, Urdu, Chinese 1
        English, German, French, Dutch 1
        English, German, Danish 1
        Spanish, English, French 1
        Urdu, Sindhi, Hindi, English 1
        Polish, English, German 1
        English, Indonesian, Dutch, German 1
        English, Greek, Russian, German 1
        French, Italian, Urdu, English 1
        German, Spanish, English 1
        English, German, Russian, Arabic 1
        English, Urdu, German, Korean 1
        Bengali, English, Hindi, French 1
        Sindhi, Urdu, Hindi, English 1
        Urdu, Hindi, English, French 1
        Abkhazian, Chinese, Spanish, English 1
        Urdu, English, Pashto 1
        Urdu, English, Arabic, Hindi 1
        Urdu, Pashto, Hindi, English 1
        Russian, English, German 1
        English, Turkish, Russian, Polish 1
        English, Croatian, Bosnian, German 1
        English, Polish, German, Spanish 1
        English, Uzbek 1
        English, Romanian, German 1
        English, Persian, Kurdish, Turkish 1
        Spanish, Galician, Portuguese, English 1
        English, Albanian, Spanish, Chinese 1
        English, Italian, Hungarian 1
        Urdu, English, Turkish 1
        English, Bengali, Hindi, Malay 1
        Urdu, English, French, Arabic 1
        English, Bengali, German, Finnish 1
        English, German, French, Norwegian Bokmål 1
        English, Bengali, Spanish, German 1
        English, Spanish, German, Hebrew 1
        German, Hebrew, French, English 1
        Bengali, Dutch, Spanish, English 1
        English, Romanian, Hungarian 1
        Arabic, English, French 1
        English, Spanish, Portuguese, French 1
        Hindi, English, Arabic 1
        English, Spanish, Hebrew, French 1
        Urdu, English, Hindi, German 1
        English, Hindi, Punjabi, Spanish 1
        Bengali, Hindi, English, Urdu 1
        English, Uzbek, Russian, Korean 1
        Urdu, English, German, Italian 1
        English, French, Arabic, German 1
        English, Turkish, Spanish, French 1
        Urdu, English, Pashto, Punjabi 1
        Telugu, English, Hindi, Kannada 1
        English, Russian, Romanian, Greek 1
        English, Russian, Ukrainian, Belarusian 1
        English, Turkish, Russian 1
        Arabic, English, German, French 1
        English, French, Arabic, Russian 1
        English, Hindi, French, Arabic 1
        English, Hindi, Portuguese 1
        Spanish, English, German, Italian 1
        Nepali, Hindi, English 1
        English, Urdu, Sindhi 1
        English, Urdu, Dutch 1
        Hindi, English, Bengali 1
        Arabic, German, English, French 1
        English, Chinese, Spanish, French 1
        Urdu, Punjabi, English, Spanish 1
        Flutter 1
        English, Urdu, German, Spanish 1
        Bengali, English, Spanish, Hindi 1
        Urdu, English, Russian, Dutch 1
        English, French, Spanish, Russian 1
        English, Arabic, Urdu 1
        English, Russian, French, Turkish 1
        English, Nepali, Hindi 1
        Bengali, English, German, Hebrew 1
        Bengali, English, Chinese, German 1
        French, German, English 1
        English, Spanish, Italian, Russian 1
        English, German, Turkish, French 1
        Hindi, Gujarati, Tamil, English 1
        English, German, Portuguese 1
        English, French, Dutch, Italian 1
        English, Arabic, French, Dutch 1
        English, Bengali, Spanish 1
        English, Croatian, German, Swedish 1
        English, Hindi, German, Italian 1
        English, German, Spanish, Hebrew 1
        Urdu, English, Spanish, Chinese 1
        English, German, Spanish, Swedish 1
        French, Arabic, Spanish, English 1
        English, Hebrew, Spanish, German 1
        English, Turkmen, Russian 1
        English, Ukrainian, Italian 1
        English, Danish, French, German 1
        Spanish, Urdu, English, Portuguese 1
        English, Spanish, French, Chinese 1
        English, Macedonian, Serbian, Spanish 1
        English, Italian, German 1
        Russian, English, Polish 1
        Spanish, Arabic, German, English 1
        English, French, Spanish, Hindi 1
        English, Russian, Spanish, French 1
        English, Romanian, French 1
        English, Swahili, Spanish, German 1
        English, Swedish, Spanish, French 1
        English, Bulgarian, German, Spanish 1
        Haitian, French, English 1
        English, Turkish, French 1
        English, French, Yoruba 1
        English, Serbian, Croatian, Russian 1
        English, Spanish, Portuguese, Armenian 1
        English, Polish, Japanese 1
        English, Kinyarwanda, French 1
        Urdu, German, English 1
        English, German, Persian, Chinese 1
        English, Persian, Arabic, Polish 1
        English, Bulgarian, German 1
        English, Russian, Spanish, Ukrainian 1
        English, Spanish, French, Dutch 1
        Sinhala, English, French, Spanish 1
        English, Hebrew, French 1
        English, Tamil, Malayalam 1
        English, Czech 1
        English, Spanish, French, Japanese 1
        English, French, German, Icelandic 1
        English, Japanese, Hebrew 1
        English, Tagalog, Spanish, Italian 1
        English, Mongolian, Albanian 1
        Urdu, English, Spanish, Hindi 1
        Hindi, English, Marathi 1
        Arabic, Urdu, English, Pashto 1
        English, Swedish, Italian, German 1
        English, Hindi, Dutch 1
        French, Spanish, English 1
        English, Russian, Ukrainian, Spanish 1
        Arabic, English, French, Spanish 1
        Igbo, English, German 1
        English, Urdu, Russian, Spanish 1
        English, Yoruba, French 1
        Spanish, English, Portuguese 1
        English, Interlingua, French 1
        Kannada, English, Hindi, Marathi 1
        English, Spanish, Dutch, Portuguese 1
        English, Hebrew, Lithuanian 1
        English, Spanish, French, Hebrew 1
        English, Japanese, German, Italian 1
        English, Hindi, German 1
        Urdu, English, Portuguese, Spanish 1
        English, Hindi, Malayalam, Tamil 1
        Bengali, Spanish, English, Hindi 1
        Dutch, Arabic, English, French 1
        German, French, Albanian, English 1
        English, Faroese, German 1
        Hindi, French, English, German 1
        German, English, French, Italian 1
        English, Chinese, Spanish, Arabic 1
        English, Arabic, Spanish, Russian 1
        English, Urdu, Arabic, Spanish 1
        English, Tamil, Telugu 1
        Real estate analysis 1
        English, German, Greek, Latin 1
        English, Korean, Chinese, Japanese 1
        English, Arabic, Turkish, Russian 1
        English, Dutch, German, Chinese 1
        Bengali, English, Hindi, Korean 1
        English, French, Persian, Spanish 1
        English, Hindi, Bengali 1
        English, Urdu, Italian, French 1
        English, Maltese, Italian, Turkish 1
        English, Portuguese, German 1
        English, Dutch, German, Spanish 1
        English, Spanish, French, Russian 1
        English, Croatian, German, Slovenian 1
        Portuguese, English, Spanish, French 1
        English, Swahili, Chinese 1
        English, Swedish, Spanish 1
        English, German, Dutch, Indonesian 1
        Spanish, Arabic 1
        English, Greek, Italian 1
        English, Italian, French, Arabic 1
        English, Spanish, Arabic, Chinese 1
        English, Malayalam, Hindi, Tamil 1
        English, Ukrainian, Hebrew 1
        English, Urdu, Chinese, Spanish 1
        Tamil, English, Hindi 1
        Tamil, English, French, German 1
        English, Serbian, Croatian, Macedonian 1
        English, Spanish, French, Hindi 1
        English, Portuguese, French, German 1
        English, Kazakh, Ukrainian 1
        Hindi, Marathi, English 1
        Marathi, English, Hindi 1
        English, German, Spanish, Finnish 1
        English, Indonesian, Spanish 1
        English, Welsh 1
        English, Portuguese, Japanese, Spanish 1
        French, German, Portuguese, English 1
        English, Chinese, Russian, German 1
        English, Ukrainian, Russian, Slovak 1
        English, Khmer 1
        English, Greek, German 1
        English, Slovak, Czech, German 1
        English, Spanish, French, Portuguese 1
        English, French, Haitian, Spanish 1
        English, German, Luxembourgish, French 1
        Swedish, English 1
        English, Dutch, Portuguese, French 1
        English, Swedish, German 1
        English, Spanish, Latin, Portuguese 1
        English, Polish, German, French 1
        English, Irish 1
        English, Arabic, Chinese, French 1
        Turkish 1
        English, Greek, Latin 1
        English, Kazakh, Russian 1
        English, Slovenian, Bosnian, Spanish 1
        English, Chinese, Indonesian 1
        English, Vietnamese, Korean 1
        English, Greek, French, Spanish 1
        English, Finnish, Swedish, Spanish 1
        English, Chinese, French 1
        English, French, Hebrew 1
        English, Bulgarian, Spanish, Russian 1
        English, Scottish Gaelic 1
        Data analysis 1
        Hindi, English, Spanish, German 1
        English, Ukrainian, German, Spanish 1
        English, Korean, Thai, Spanish 1
        English, Albanian, Spanish 1
        English, Hebrew, Italian, French 1
        English, Polish, Spanish, German 1
        English, Dutch, Portuguese, Korean 1
        English, Norwegian Bokmål 1
        English, Serbian, Macedonian, German 1
        Telugu, Hindi, English 1
        English, Serbian, Bulgarian 1
        English, Korean, Arabic 1
        Urdu, English, Dutch 1
        Thai, Korean, Spanish, English 1
        English, Tagalog, German 1
        Hindi, Tamil, English 1
        English, Serbian, Spanish 1
        English, German, Spanish, Thai 1
        Urdu, English, French, Italian 1
        English, Marathi, Hindi, German 1
        English, Serbian, Croatian, German 1
        English, Romanian, Russian, French 1
        Urdu 1
        English, Lithuanian, Russian, Ukrainian 1
        English, Indonesian, Italian 1
        Tagalog, English, Spanish 1
        English, Urdu, Punjabi, Pashto 1
        English, Turkish, Urdu, Hindi 1
        English, Slovenian, Italian 1
        English, Hindi, Gujarati, Marathi 1
        English, Thai, Indonesian, Korean 1
        Thai, English, Korean, Indonesian 1
        English, Afrikaans, Dutch 1
        Bulgarian, English 1
        English, Hebrew, Russian, Korean 1
        English, Haitian 1
        English, Arabic, Spanish, German 1
        English, Malay, Chinese, Japanese 1
        English, Arabic, Turkish 1
        Punjabi, Urdu, English, Hindi 1
        English, Japanese, Chinese 1
        English, Spanish, Italian, Esperanto 1
        Urdu, Arabic, English 1
        Indonesian, Thai, English, Korean 1
        Russian, Belarusian, English, French 1
        English, Korean, Spanish, French 1
        English, Slovak, Czech 1
        English, Arabic, French, Japanese 1
        English, Kazakh, Turkish 1
        Punjabi, Urdu, English, German 1
        English, Spanish, Urdu, French 1
        English, Korean, Italian, Thai 1
        English, Latvian 1
        Bulgarian, English, Russian 1
        Thai 1
        English, Romanian, Russian, Ukrainian 1
        English, French, German, Thai 1
        English, Ukrainian, Polish, Russian 1
        English, German, Urdu 1
        Thai, English, Korean, Spanish 1
        English, Ukrainian, German, Polish 1
        English, Croatian, Russian, Bulgarian 1
        English, Spanish, German, Korean 1
        Urdu, Sindhi, English, Arabic 1
        English, Belarusian, Ukrainian 1
        Spanish, French, Western Frisian, English 1
        English, Turkish, German, Spanish 1
        Ukrainian, English, Polish 1
        English, Hebrew, Spanish, French 1
        English, Urdu, Arabic, Dutch 1
        French, Hebrew, English 1
        English, German, Hebrew, Chinese 1
        English, Tagalog, Chinese 1
        English, Spanish, Russian 1
        Bengali, Hindi, Urdu, English 1
        English, Urdu, Chinese, Malay 1
        English, Slovak, French 1
        English, Sinhala, Japanese 1
        English, Urdu, Spanish, Portuguese 1
        Bengali, English, Hebrew, Arabic 1
        Urdu, Hindi, Punjabi, English 1
        Turkish, English, German 1
        English, Russian, Turkish, German 1
        French, English, Hebrew 1
        English, Italian, Dutch 1
        Urdu, English, Hindi, Bengali 1
        English, Hebrew, Italian 1
        English, Japanese, French 1
        English, Serbian, Slovak 1
        Uzbek, Russian, English 1
        English, Hindi, Spanish, German 1
        English, Indonesian, Japanese, Chinese 1
        English, Albanian, Italian 1
        Arabic, English, Spanish, German 1
        English, Italian, Arabic, Spanish 1
        English, Hindi, German, French 1
        English, Korean, Thai 1
        Hindi, English, Thai, Korean 1
        German, English, Serbian 1
        English, Arabic, Russian, French 1
        English, Vietnamese, Korean, Japanese 1
        Urdu, Punjabi, English, French 1
        Hindi, English, French, German 1
        English, Urdu, Thai, French 1
        English, Greek, Albanian 1
        English, Spanish, Portuguese, Catalan 1
        English, French, Urdu, Hindi 1
        English, Spanish, Hebrew 1
        English, Spanish, Arabic, French 1
        English, French, German, Korean 1
        English, Spanish, German, Swedish 1
        Japanese, English, German 1
        Hindi, Urdu, English, Spanish 1
        English, Ukrainian, German, Russian 1
        English, Dutch, Italian, Spanish 1
        Tamil, English, Arabic, Urdu 1
        English, German, Russian, Latin 1
        Bengali, English, Hindi, Assamese 1
        English, Russian, French, Spanish 1
        Malayalam, English, Hindi, Tamil 1
        Hindi, English, Spanish, Urdu 1
        English, Serbian, German, Chinese 1
        Bengali, English, Spanish, Hebrew 1
        English, Bulgarian, Russian 1
        Urdu, English, Persian 1
        Urdu, English, Pashto, Hindi 1
        English, Urdu, Chinese 1
        English, Hindi, Telugu, Oriya 1
        Data science 1
        English, Turkish, Armenian 1
        English, Hindi, Marathi, German 1
        English, Italian, French, Dutch 1
        English, Hebrew, Bengali, Hindi 1
        English, Chinese, Russian, Ukrainian 1
        English, Yoruba, French, Arabic 1
        English, Spanish, Estonian 1
        English, German, Romanian 1
        English, Norwegian Bokmål, Norwegian Nynorsk 1
        English, French, Spanish, Arabic 1
        English, Italian, Korean, Spanish 1
        German 1
        English, Spanish, Portuguese, Italian 1
        Telugu, English, Hindi 1
        English, Dutch, Swahili, French 1
        English, Hindi, Malayalam, Kannada 1
        English, French, Korean, German 1
        English, French, German, Dutch 1
        English, Italian, German, French 1
        Punjabi, English, Urdu, Sindhi 1
        English, Malay, Chinese 1
        English, Dutch, Hebrew, Spanish 1
        English, Japanese, Tagalog, Korean 1
        English, Bosnian, Spanish, Norwegian 1
        Swahili, English 1
        Spanish, Catalan, Italian, Russian 1
        English, German, Portuguese, Spanish 1
        English, Urdu, Punjabi, Hindi 1
        English, Malayalam, Hindi, Punjabi 1
        English, French, Korean, Japanese 1
        English, French, Swedish 1
        French, Spanish 1
        Russian, Turkish, Azerbaijani, English 1
        English, Italian, Portuguese, Spanish 1
        English, Tigrinya, Amharic 1
        English, Hindi, Marathi, Punjabi 1
        Chinese, German, English 1
        English, Tamil, Sinhala 1
        English, Spanish, Catalan, Portuguese 1
        English, Urdu, Hindi, French 1
        English, Slovak, Czech, Polish 1
        Romanian, English 1
        Hindi, English, Kannada 1
        English, Slovak, Czech, Spanish 1
        English, Turkish, French, German 1
        Italian 1
        English, Lithuanian, Russian 1
        English, Italian, Spanish, Czech 1
        English, Polish, French, German 1
        English, Italian, Russian, German 1
        English, Romanian, Spanish, German 1
        English, Serbian, Spanish, German 1
        Russian, Romanian 1
        Punjabi, English, Urdu, Hindi 1
        English, Hindi, Arabic 1
        English, Ukrainian, Russian, German 1
        Russian, Romanian, English 1
        English, Igbo, Yoruba, Hausa 1
        English, German, Polish 1
        English, French, German, Luxembourgish 1
        Hindi, Gujarati, English, Urdu 1
        English, Indonesian, Japanese 1
        English, French, Japanese, Spanish 1
        English, Polish, German 1
        English, Bulgarian, Polish, German 1
        English, Hindi, Kannada 1
        English, Xhosa 1
        English, Turkish, Spanish, Russian 1
        English, Hindi, Sanskrit, Spanish 1
        English, Bengali, Spanish, French 1
        English, Hindi, Spanish, Hebrew 1
        Hindi, English, Gujarati, Assamese 1
        Assamese, English, Hindi, Bengali 1
        Arabic, Spanish, English 1
        English, French, Sinhala 1
        English, Bengali, Urdu, Hindi 1
        English, Georgian, Turkish, Spanish 1
        Hindi, English, German, Urdu 1
        English, Indonesian, Korean 1
        Bengali, English, Spanish, Russian 1
        English, Urdu, Turkish 1
        English, Dutch, Spanish 1
        Urdu, English, Dutch, French 1
        English, Bengali, German 1
        English, German, Turkish, Arabic 1
        French 1
        Hindi, Telugu, English 1
        English, Russian, Latvian 1
        English, Bulgarian, Russian, Romanian 1
        English, French, Italian, Arabic 1
        English, Azerbaijani, Turkish, Russian 1
        English, Romanian, Persian 1
        English, Spanish, Latin 1
        Sinhala 1
        English, French, Portuguese 1
        Greek, Romanian, English 1
        English, Italian, German, Russian 1
        English, German, Japanese, Chinese 1
        English, Hindi, Tamil, Telugu 1
        English, French, Japanese 1
        Italian, Spanish, English, Portuguese 1
        English, German, Hindi, French 1
        Spanish, Portuguese 1
        English, Italian, Spanish, Russian 1
        Italian, English, Spanish, German 1
        English, Spanish, Russian, Armenian 1
        English, German, Swedish, French 1
        Spanish, Portuguese, English 1
        English, Indonesian, Turkish 1
        English, Spanish, German, Hindi 1
        Urdu, English, Punjabi, Spanish 1
        English, Chinese, Hungarian 1
        English, Turkish, Japanese, German 1
        Spanish, German, English 1
        English, Korean, German, Spanish 1
        Bengali, Hebrew, Hindi, English 1
        English, Hindi, Tagalog, Spanish 1
        English, Hindi, French, German 1

One-hot encode the languages that appear in at least 1% of all records¶

In [ ]:
df.loc[df['Language'].notna(), 'Language'] = df['Language'].dropna().str.split(',').apply(lambda x: ','.join(i.strip() for i in x))
df_encoded = df['Language'].str.get_dummies(sep=',')

percentage = df_encoded.mean()
cols_to_drop = percentage[percentage < 0.01].index
df_encoded = df_encoded.drop(cols_to_drop, axis=1)

df = pd.concat([df, df_encoded], axis=1)
df_encoded.info()
<class 'pandas.core.frame.DataFrame'>
Index: 5922 entries, 0 to 6261
Data columns (total 18 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Arabic      5922 non-null   int64
 1   Bengali     5922 non-null   int64
 2   Chinese     5922 non-null   int64
 3   Dutch       5922 non-null   int64
 4   English     5922 non-null   int64
 5   French      5922 non-null   int64
 6   German      5922 non-null   int64
 7   Hebrew      5922 non-null   int64
 8   Hindi       5922 non-null   int64
 9   Indonesian  5922 non-null   int64
 10  Italian     5922 non-null   int64
 11  Portuguese  5922 non-null   int64
 12  Punjabi     5922 non-null   int64
 13  Russian     5922 non-null   int64
 14  Spanish     5922 non-null   int64
 15  Turkish     5922 non-null   int64
 16  Ukrainian   5922 non-null   int64
 17  Urdu        5922 non-null   int64
dtypes: int64(18)
memory usage: 879.0 KB
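The encoding step above can be checked on a small example. This is a minimal sketch on made-up rows, not the Fiverr data: the `toy` frame and the 0.5 threshold are assumptions chosen so the effect is visible with four rows (the notebook itself uses 0.01).

```python
import pandas as pd

# Toy data (hypothetical): comma-separated language lists, one missing value.
toy = pd.DataFrame({
    "Language": ["English, Urdu", "English,German", None, "English, Urdu, Hindi"]
})

# Strip whitespace around each token so "English, Urdu" and "English,Urdu" match.
cleaned = toy["Language"].str.split(",").apply(
    lambda parts: ",".join(p.strip() for p in parts) if isinstance(parts, list) else parts
)

# One indicator column per language; rows with a missing value become all zeros.
dummies = cleaned.str.get_dummies(sep=",")

# Keep only languages present in at least `min_share` of the rows.
min_share = 0.5  # illustrative threshold; the notebook uses 0.01
share = dummies.mean()
kept = dummies.loc[:, share >= min_share]

print(share.to_dict())
print(list(kept.columns))
```

Because the column mean of a 0/1 indicator is the share of rows that contain the language, comparing it with the threshold is enough to decide which columns to keep.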

Drop Language Column¶

In [ ]:
df.drop('Language', axis=1,inplace=True)
In [ ]:
print_unique_elements(df, ['Order in Queue'])
Order in Queue:
        1 order in queue 822
        2 orders in queue 443
        3 orders in queue 249
        4 orders in queue 159
        5 orders in queue 109
        6 orders in queue 85
        7 orders in queue 50
        8 orders in queue 44
        11 orders in queue 31
        10 orders in queue 29
        9 orders in queue 20
        12 orders in queue 15
        13 orders in queue 11
        16 orders in queue 9
        14 orders in queue 7
        18 orders in queue 7
        24 orders in queue 6
        19 orders in queue 6
        15 orders in queue 6
        22 orders in queue 6
        31 orders in queue 3
        17 orders in queue 3
        25 orders in queue 3
        21 orders in queue 3
        26 orders in queue 3
        38 orders in queue 3
        27 orders in queue 2
        20 orders in queue 2
        56 orders in queue 2
        23 orders in queue 2
        42 orders in queue 1
        30 orders in queue 1
        52 orders in queue 1
        35 orders in queue 1
        29 orders in queue 1
        72 orders in queue 1
        41 orders in queue 1
        45 orders in queue 1
        46 orders in queue 1
        39 orders in queue 1
        54 orders in queue 1
        164 orders in queue 1
        43 orders in queue 1
        44 orders in queue 1
        82 orders in queue 1
        76 orders in queue 1
        215 orders in queue 1
        63 orders in queue 1
        34 orders in queue 1
        161 orders in queue 1
        32 orders in queue 1

For null values, we assume the gig has 0 orders in queue¶

In [ ]:
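# Raw string avoids an invalid-escape warning for \d; expand=False returns a Series instead of a one-column DataFrame.
df['Order in Queue'] = df['Order in Queue'].str.extract(r'(\d+)', expand=False).fillna(0).astype(int)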
print_unique_elements(df, ['Order in Queue'])
Order in Queue:
        0 3761
        1 822
        2 443
        3 249
        4 159
        5 109
        6 85
        7 50
        8 44
        11 31
        10 29
        9 20
        12 15
        13 11
        16 9
        14 7
        18 7
        19 6
        15 6
        22 6
        24 6
        31 3
        17 3
        25 3
        21 3
        26 3
        38 3
        27 2
        20 2
        56 2
        23 2
        42 1
        30 1
        52 1
        35 1
        29 1
        72 1
        41 1
        45 1
        46 1
        39 1
        54 1
        164 1
        43 1
        44 1
        82 1
        76 1
        215 1
        63 1
        34 1
        161 1
        32 1
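The extraction can be tried on a few strings in isolation. The sketch below uses made-up values and also keeps a flag for rows that were originally missing, since treating a missing value as 0 is an assumption rather than something the scraped page states; the `was_missing` name is hypothetical.

```python
import pandas as pd

# Toy values (hypothetical) in the same shape as the scraped column.
raw = pd.Series(["1 order in queue", "12 orders in queue", None])

# Remember which rows had no value before filling them with 0.
was_missing = raw.isna()

# Pull out the leading digits, convert to numbers, then fill the gaps.
digits = raw.str.extract(r"(\d+)", expand=False)
orders = pd.to_numeric(digits).fillna(0).astype(int)

print(orders.tolist())
print(was_missing.tolist())
```

Keeping `was_missing` alongside the filled column makes it possible to test later whether "no value shown" really behaves like "zero orders".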
In [ ]:
mapping = {"new seller": 1, "level 1": 2, "level 2": 3, "top rated seller": 4}
df["Seller Level"] = df["Seller Level"].map(mapping)
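`Series.map` with a dictionary returns NaN for every label that is not a key, so it is worth checking for unmapped labels after an ordinal encoding like the one above. A minimal sketch, assuming labels may differ in case or spacing (the toy labels below are invented):

```python
import pandas as pd

# Toy labels (hypothetical); "pro" is deliberately absent from the mapping.
levels = pd.Series(["Level 1", "new seller", "Top Rated Seller", "pro"])
mapping = {"new seller": 1, "level 1": 2, "level 2": 3, "top rated seller": 4}

# Normalize case and whitespace first; labels missing from the dict map to NaN.
encoded = levels.str.strip().str.lower().map(mapping)

# List the original labels that did not receive a code.
unmapped = levels[encoded.isna()].unique().tolist()

print(encoded.tolist())
print(unmapped)
```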
In [ ]:
numeric_fields = ["Seller Level", "Seller In Same Level", "Basic Price","Standard Price","Premium Price","Basic Delivery","Standard Delivery","Premium Delivery","Basic Revision","Standard Revision","Premium Revision","Rating","Rating Count","Avg Response Time","Last Delivery","Order in Queue"]

df_selected = df[numeric_fields]

corr_matrix = df_selected.corr()

plt.figure(figsize=(12, 8))
sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap="coolwarm", cbar=True)
plt.title("Correlation Matrix Heatmap")
plt.show()
Image
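`DataFrame.corr()` computes Pearson correlation by default, which measures linear association and is sensitive to skewed fields such as prices. As a possible complement, not something the original notebook does, rank-based Spearman correlation is available through the `method` argument. A small sketch on invented numbers:

```python
import pandas as pd

# Toy data (hypothetical): y grows monotonically with x, but far from linearly.
toy = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [1, 10, 100, 1000, 10000]})

pearson = toy.corr().loc["x", "y"]                      # linear association
spearman = toy.corr(method="spearman").loc["x", "y"]    # monotonic association on ranks

print(round(pearson, 3), round(spearman, 3))
```

On the notebook's data the same call would be `df_selected.corr(method="spearman")`, and the resulting matrix can be passed to `sns.heatmap` unchanged.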
In [ ]:
config = Settings()

profile = ProfileReport(df, title="training data EDA", config=config)
profile.to_file("fiverr_EDA.html")
In [ ]:
profile
Out[ ]:

The price fields are heavily skewed, so we plot them without outliers.¶

In [ ]:
def plot_within_3_sigma(df, fields):
    for field in fields:
        mean = df[field].mean()
        sigma = df[field].std()
        min_val = mean - 3 * sigma
        max_val = mean + 3 * sigma

        filtered_data = df[(df[field] >= min_val) & (df[field] <= max_val)]

        plt.figure(figsize=(10, 6))
        plt.hist(filtered_data[field], bins=30, edgecolor='black')
        plt.title(f'Histogram of {field} within mean ± 3 sigma')
        plt.xlabel(field)
        plt.ylabel('Frequency')
        plt.show()
        plt.close()
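The filtering rule inside `plot_within_3_sigma` can be separated from the plotting so it is easy to inspect. This is a sketch with a hypothetical helper name (`within_k_sigma`) and invented prices; note that extreme values also inflate the mean and the standard deviation that define the bounds.

```python
import pandas as pd

def within_k_sigma(series: pd.Series, k: float = 3.0) -> pd.Series:
    """Boolean mask of values inside mean ± k * std (NaN values give False)."""
    mean, sigma = series.mean(), series.std()
    return series.between(mean - k * sigma, mean + k * sigma)

# Toy prices (hypothetical): 19 ordinary values plus one extreme value.
prices = pd.Series(list(range(5, 100, 5)) + [5000], dtype=float)

mask = within_k_sigma(prices)
print(int(mask.sum()), "of", len(prices), "values kept")
print(prices[~mask].tolist())
```

Returning a mask rather than a filtered frame lets the same rule drive both the histogram and a count of how many rows were excluded.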
In [ ]:
plot_within_3_sigma(df , numeric_fields)
Image (16 histograms, one per field in numeric_fields, each restricted to mean ± 3 sigma)
In [ ]:
df_single_plan = df[df['is_single_plan'] == True]
df_multi_plan = df[df['is_single_plan'] == False]
In [ ]:
df_single_plan.drop(columns=['Standard Price','Premium Price','Standard Delivery','Premium Delivery','Standard Revision','Premium Revision'],axis=1,inplace=True)
C:\Users\Lenovo\AppData\Local\Temp\ipykernel_13740\4002396670.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_single_plan.drop(columns=['Standard Price','Premium Price','Standard Delivery','Premium Delivery','Standard Revision','Premium Revision'],axis=1,inplace=True)
In [ ]:
df_single_plan.rename(columns={'Basic Price': 'Price','Basic Delivery':'Delivery','Basic Revision':'Revision'},inplace=True)
C:\Users\Lenovo\AppData\Local\Temp\ipykernel_13740\3074068187.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_single_plan.rename(columns={'Basic Price': 'Price','Basic Delivery':'Delivery','Basic Revision':'Revision'},inplace=True)
In [ ]:
print_unique_elements(df_single_plan, ["Revision"])
Revision:
        -1 415
        1 87
        100 71
        2 53
        3 30
        5 12
        4 3
        8 1
        9 1
In [ ]:
df_single_plan.drop(columns=["Revision"],inplace=True)
C:\Users\Lenovo\AppData\Local\Temp\ipykernel_13740\358071252.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_single_plan.drop(columns=["Revision"],inplace=True)
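The `SettingWithCopyWarning` messages above appear because `df_single_plan` was created by filtering `df` and is then modified in place. One common way to avoid the warning is to take an explicit copy when splitting. A minimal sketch on an invented frame (`df_demo` is hypothetical):

```python
import pandas as pd

# Toy frame (hypothetical) with the same kind of split column.
df_demo = pd.DataFrame({
    "is_single_plan": [True, False, True],
    "Basic Price": [10, 20, 30],
    "Standard Price": [None, 40, None],
})

# .copy() makes the subset an independent DataFrame, so later edits are unambiguous.
single = df_demo[df_demo["is_single_plan"]].copy()

# Reassigning the result instead of using inplace=True keeps each step explicit.
single = single.drop(columns=["Standard Price"]).rename(columns={"Basic Price": "Price"})

print(single.columns.tolist())
print(df_demo.columns.tolist())
```

The original frame keeps all of its columns, while the copy carries the renamed single-plan layout.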
In [ ]:
print_unique_elements(df_single_plan,["Category"])
Category:
        Digital Marketing 158
        Data 111
        Lifestyle 100
        Business 65
        Music & Audio 58
        Programming & Tech 55
        Graphics & Design 40
        Writing & Translation 35
        Video & Animation 30
        Photography 21
In [ ]:
gc.collect()
Out[ ]:
997435
In [ ]:
single_plan_profile = ProfileReport(df_single_plan, title="single plan EDA", config=config)
single_plan_profile.to_file("fiverr_single_plan_EDA.html")
C:\Users\Lenovo\AppData\Roaming\Python\Python311\site-packages\ydata_profiling\utils\dataframe.py:137: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.rename(columns={"index": "df_index"}, inplace=True)
In [ ]:
gc.collect()
multy_plan_profile = ProfileReport(df_multi_plan, title="multi plan EDA", config=config)
multy_plan_profile.to_file("multy_plan_EDA.html")
C:\Users\Lenovo\AppData\Roaming\Python\Python311\site-packages\ydata_profiling\utils\dataframe.py:137: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df.rename(columns={"index": "df_index"}, inplace=True)
In [ ]:
# Compare the number of gigs (rows) in each group.
# DataFrame.count() returns one value per column, so len() is used to get a single row count.
labels = ["Single Plan", "Multi Plan"]
counts = [len(df_single_plan), len(df_multi_plan)]

plt.bar(labels, counts, color=["blue", "green"])

plt.title("Number of Single Plan vs Multi Plan Gigs")
plt.ylabel("Number of gigs")
plt.show()
Image
In [ ]:
plot_within_3_sigma(df_multi_plan , numeric_fields)
Image (16 histograms for df_multi_plan, one per field in numeric_fields)
In [ ]:
plot_within_3_sigma(df_single_plan,['Seller Level','Seller In Same Level','Price','Delivery','Rating','Rating Count','Avg Response Time','Last Delivery','Order in Queue'])
Image (9 histograms for df_single_plan, one per listed field)
In [ ]:
for level in df_multi_plan['Seller Level'].unique():
    print(f"Shapiro-Wilk Test for level {level}:", stats.shapiro(df_multi_plan['Basic Price'][df_multi_plan['Seller Level'] == level]))

    stats.probplot(df_multi_plan['Basic Price'][df_multi_plan['Seller Level'] == level], plot=plt)
    plt.title(f"Q-Q plot for level {level}")
    plt.show()
Shapiro-Wilk Test for level 2: ShapiroResult(statistic=0.2886931300163269, pvalue=0.0)
Image
Shapiro-Wilk Test for level 3: ShapiroResult(statistic=0.3321465253829956, pvalue=0.0)
Image
Shapiro-Wilk Test for level 1: ShapiroResult(statistic=0.21009880304336548, pvalue=0.0)
Image
Shapiro-Wilk Test for level 4: ShapiroResult(statistic=0.29845017194747925, pvalue=0.0)
Image

The Shapiro-Wilk test rejects normality for every seller level (p-value ≈ 0), so the normality assumption of ANOVA does not hold and we use the Kruskal-Wallis test instead.
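The choice between the two tests can be written as a small decision step. This is a sketch on synthetic right-skewed samples, not the Fiverr prices: the log-normal parameters, the group sizes, and the 0.05 threshold are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic right-skewed "prices" for three hypothetical seller levels.
samples = [rng.lognormal(mean=m, sigma=1.0, size=200) for m in (2.0, 2.5, 3.0)]

alpha = 0.05
# ANOVA assumes roughly normal groups; fall back to a rank-based test otherwise.
all_normal = all(stats.shapiro(s).pvalue > alpha for s in samples)

if all_normal:
    stat, p = stats.f_oneway(*samples)
    test_name = "one-way ANOVA"
else:
    stat, p = stats.kruskal(*samples)
    test_name = "Kruskal-Wallis"

print(test_name, stat, p)
```

Kruskal-Wallis works on ranks, so it does not need normally distributed groups and is less affected by a few very expensive gigs.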

In [ ]:
def plot_average_prices(df, group_by_columns, price_columns, height, orientation='horizontal', sort=True, original_order=None):
    fig, axs = plt.subplots(1, len(price_columns), figsize=(15, height))
    axs = np.atleast_1d(axs)  # plt.subplots returns a single Axes (not an array) when there is one column

    for i, price_column in enumerate(price_columns):
        average_prices = df.groupby(group_by_columns)[price_column].mean()
        if sort:
            average_prices = average_prices.sort_values(ascending=True)
        if not sort and original_order :
            average_prices = average_prices.reindex(original_order)
        if orientation == 'horizontal':
            average_prices.plot(kind='barh', color='skyblue', ax=axs[i])
            axs[i].set_xlabel(price_column)
        else:
            average_prices.plot(kind='bar', color='skyblue', ax=axs[i])
            axs[i].set_ylabel(price_column)

        axs[i].set_title(f'Average {price_column} for Each {group_by_columns}')
        axs[i].grid(axis='x' if orientation == 'horizontal' else 'y')

    plt.tight_layout()
    plt.show()
In [ ]:
H, pvalue = stats.kruskal(df_multi_plan['Basic Price'][df_multi_plan['Seller Level'] == 1],
                          df_multi_plan['Basic Price'][df_multi_plan['Seller Level'] == 2],
                          df_multi_plan['Basic Price'][df_multi_plan['Seller Level'] == 3],
                          df_multi_plan['Basic Price'][df_multi_plan['Seller Level'] == 4])

print('H-statistic:', H)
print('P-value:', pvalue)
H-statistic: 401.05949076441806
P-value: 1.3051530131546888e-86

Based on the Kruskal-Wallis result (p-value ≈ 1.3e-86), the distribution of Basic Price differs significantly across seller levels, so there is a meaningful relationship between Basic Price and Seller Level.

In [ ]:
H, pvalue = stats.kruskal(df_multi_plan['Standard Price'][df_multi_plan['Seller Level'] == 1],
                          df_multi_plan['Standard Price'][df_multi_plan['Seller Level'] == 2],
                          df_multi_plan['Standard Price'][df_multi_plan['Seller Level'] == 3],
                          df_multi_plan['Standard Price'][df_multi_plan['Seller Level'] == 4])

print('H-statistic:', H)
print('P-value:', pvalue)
H-statistic: 329.41542370806235
P-value: 4.27045420662165e-71

Based on the Kruskal-Wallis result (p-value ≈ 4.3e-71), the distribution of Standard Price differs significantly across seller levels, so there is a meaningful relationship between Standard Price and Seller Level.

In [ ]:
H, pvalue = stats.kruskal(df_multi_plan['Premium Price'][df_multi_plan['Seller Level'] == 1],
                          df_multi_plan['Premium Price'][df_multi_plan['Seller Level'] == 2],
                          df_multi_plan['Premium Price'][df_multi_plan['Seller Level'] == 3],
                          df_multi_plan['Premium Price'][df_multi_plan['Seller Level'] == 4])

print('H-statistic:', H)
print('P-value:', pvalue)
H-statistic: 272.7436345345654
P-value: 7.868162097551349e-59

Based on the Kruskal-Wallis result (p-value ≈ 7.9e-59), the distribution of Premium Price differs significantly across seller levels, so there is a meaningful relationship between Premium Price and Seller Level.
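
With thousands of rows, very small p-values can appear even when the practical difference is modest, so an effect size is a useful companion to the H statistic. The sketch below computes two common ones, epsilon-squared, $H/(n-1)$, and the H-based eta-squared, $(H-k+1)/(n-k)$, where $n$ is the total number of observations and $k$ the number of groups. The helper name `kruskal_effect_sizes` and the toy samples are assumptions for illustration; no effect size was reported in the original analysis.

```python
import scipy.stats as stats


def kruskal_effect_sizes(H, n, k):
    """Effect sizes for a Kruskal-Wallis H statistic.

    H: test statistic, n: total number of observations, k: number of groups.
    """
    epsilon_sq = H / (n - 1)          # epsilon-squared
    eta_sq = (H - k + 1) / (n - k)    # eta-squared based on H
    return epsilon_sq, eta_sq


# Toy samples (not from the Fiverr dataset).
a = [5, 10, 10, 15, 20]
b = [10, 20, 25, 30, 50]
c = [30, 40, 50, 80, 100]
H, pvalue = stats.kruskal(a, b, c)
n = len(a) + len(b) + len(c)
epsilon_sq, eta_sq = kruskal_effect_sizes(H, n, k=3)
print(f"H={H:.3f}, p={pvalue:.4f}, epsilon^2={epsilon_sq:.3f}, eta^2={eta_sq:.3f}")
```

For the notebook's tests, `n` would be the number of rows of `df_multi_plan` with a non-missing price and `k` would be 4.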

In [ ]:
plot_average_prices(df_multi_plan, ["Seller Level"], ["Basic Price","Standard Price","Premium Price"], 10,'vertical',False)
Image
In [ ]:
for level in df_multi_plan['Seller Level'].unique():
    print(f"Shapiro-Wilk Test for level {level}:", stats.shapiro(df_multi_plan['Basic Delivery'][df_multi_plan['Seller Level'] == level]))

    stats.probplot(df_multi_plan['Basic Delivery'][df_multi_plan['Seller Level'] == level], plot=plt)
    plt.title(f"Q-Q plot for level {level}")
    plt.show()
Shapiro-Wilk Test for level 2: ShapiroResult(statistic=0.6463815569877625, pvalue=0.0)
Image
Shapiro-Wilk Test for level 3: ShapiroResult(statistic=0.5890364050865173, pvalue=0.0)
Image
Shapiro-Wilk Test for level 1: ShapiroResult(statistic=0.5094095468521118, pvalue=0.0)
Image
Shapiro-Wilk Test for level 4: ShapiroResult(statistic=0.6410760879516602, pvalue=1.8357009882655104e-43)
Image

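Normality is rejected again for every seller level, so ANOVA still cannot be used and we apply the Kruskal-Wallis test again.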

In [ ]:
basic_H, basic_pvalue = stats.kruskal(
    df_multi_plan['Basic Delivery'][df_multi_plan['Seller Level'] == 1],
    df_multi_plan['Basic Delivery'][df_multi_plan['Seller Level'] == 2],
    df_multi_plan['Basic Delivery'][df_multi_plan['Seller Level'] == 3],
    df_multi_plan['Basic Delivery'][df_multi_plan['Seller Level'] == 4])

standard_H, standard_pvalue = stats.kruskal(
    df_multi_plan['Standard Delivery'][df_multi_plan['Seller Level'] == 1],
    df_multi_plan['Standard Delivery'][df_multi_plan['Seller Level'] == 2],
    df_multi_plan['Standard Delivery'][df_multi_plan['Seller Level'] == 3],
    df_multi_plan['Standard Delivery'][df_multi_plan['Seller Level'] == 4])

premium_H, premium_pvalue = stats.kruskal(
    df_multi_plan['Premium Delivery'][df_multi_plan['Seller Level'] == 1],
    df_multi_plan['Premium Delivery'][df_multi_plan['Seller Level'] == 2],
    df_multi_plan['Premium Delivery'][df_multi_plan['Seller Level'] == 3],
    df_multi_plan['Premium Delivery'][df_multi_plan['Seller Level'] == 4])

print('basic H-statistic:', basic_H)
print('basic P-value:', basic_pvalue)
print('standard H-statistic:', standard_H)
print('standard P-value:', standard_pvalue)
print('premium H-statistic:', premium_H)
print('premium P-value:', premium_pvalue)
basic H-statistic: 306.11297098017025
basic P-value: 4.7283074898406087e-66
standard H-statistic: 187.023382137299
standard P-value: 2.68275113121452e-40
premium H-statistic: 104.02355272123009
premium P-value: 2.119311520067058e-22

Based on the Kruskal-Wallis results (all p-values below 1e-21), there is a meaningful relationship between Seller Level and the delivery columns of all three plans (Basic, Standard, and Premium).
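
A significant Kruskal-Wallis result only says that at least one seller level differs; it does not say which ones. One possible follow-up, not performed in the original analysis, is a set of pairwise Mann-Whitney U tests with a Bonferroni correction. The helper name `pairwise_mannwhitney` and the toy data below are assumptions for illustration.

```python
from itertools import combinations

import pandas as pd
import scipy.stats as stats


def pairwise_mannwhitney(df, value_col, group_col):
    """Pairwise two-sided Mann-Whitney U tests with Bonferroni correction."""
    samples = {
        level: group[value_col].dropna()
        for level, group in df.groupby(group_col)
    }
    pairs = list(combinations(samples, 2))
    rows = []
    for g1, g2 in pairs:
        U, p = stats.mannwhitneyu(samples[g1], samples[g2], alternative="two-sided")
        # Bonferroni: multiply each p-value by the number of comparisons, cap at 1.
        rows.append({"group_a": g1, "group_b": g2, "U": U,
                     "p_adjusted": min(p * len(pairs), 1.0)})
    return pd.DataFrame(rows)


# Toy data (not from the Fiverr dataset): delivery values for 3 seller levels.
toy = pd.DataFrame({
    "Seller Level": [1] * 6 + [2] * 6 + [3] * 6,
    "Basic Delivery": [1, 2, 2, 3, 3, 4, 2, 3, 3, 4, 5, 5, 7, 8, 9, 10, 12, 14],
})
posthoc = pairwise_mannwhitney(toy, "Basic Delivery", "Seller Level")
print(posthoc)
```

Bonferroni is the most conservative common correction; with only four seller levels (six comparisons) the loss of power is small.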

In [ ]:
plot_average_prices(df_multi_plan, ["Seller Level"], ["Basic Delivery","Standard Delivery","Premium Delivery"], 10,'vertical',False)
Image
In [ ]:
for level in df_multi_plan['Seller Level'].unique():
    print(f"Shapiro-Wilk Test for level {level}:", stats.shapiro(df_multi_plan['Rating Count'][df_multi_plan['Seller Level'] == level]))

    stats.probplot(df_multi_plan['Rating Count'][df_multi_plan['Seller Level'] == level], plot=plt)
    plt.title(f"Q-Q plot for level {level}")
    plt.show()
Shapiro-Wilk Test for level 2: ShapiroResult(statistic=0.4048423171043396, pvalue=0.0)
Image
Shapiro-Wilk Test for level 3: ShapiroResult(statistic=0.2707735300064087, pvalue=0.0)
Image
Shapiro-Wilk Test for level 1: ShapiroResult(statistic=0.31081730127334595, pvalue=0.0)
Image
Shapiro-Wilk Test for level 4: ShapiroResult(statistic=0.5352907180786133, pvalue=0.0)
Image
In [ ]:
H, pvalue = stats.kruskal(df_multi_plan['Rating Count'][df_multi_plan['Seller Level'] == 1],
                          df_multi_plan['Rating Count'][df_multi_plan['Seller Level'] == 2],
                          df_multi_plan['Rating Count'][df_multi_plan['Seller Level'] == 3],
                          df_multi_plan['Rating Count'][df_multi_plan['Seller Level'] == 4])

print('H-statistic:', H)
print('P-value:', pvalue)
H-statistic: 1017.9958676510408
P-value: 2.2451486399795364e-220

The Kruskal-Wallis test (p-value ≈ 2.2e-220) shows that there is also a meaningful relationship between Rating Count and Seller Level.

In [ ]:
average_counts = df_multi_plan.groupby("Seller Level")["Rating Count"].mean()
plt.figure(figsize=(10, 6))
average_counts.plot(kind="bar", color="skyblue")
plt.xlabel("Seller Level")
plt.ylabel("Average Rating Count")
plt.title("Average Rating Count for Each Seller Level")
plt.grid(axis="y")
plt.tight_layout()
plt.show()
Image
In [ ]:
W, p = stats.shapiro(df['Avg Response Time'])
print(f'Shapiro-Wilk test statistic: {W}, p-value: {p}')
W, p = stats.shapiro(df['Rating Count'])
print(f'Shapiro-Wilk test statistic: {W}, p-value: {p}')
Shapiro-Wilk test statistic: nan, p-value: 1.0
Shapiro-Wilk test statistic: 0.3117879033088684, p-value: 0.0
c:\Python311\Lib\site-packages\scipy\stats\_morestats.py:1882: UserWarning: p-value may not be accurate for N > 5000.
  warnings.warn("p-value may not be accurate for N > 5000.")
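
The `nan` statistic reported for 'Avg Response Time' is most likely caused by missing values in that column (the later cells drop them before computing correlations), since `stats.shapiro` propagates NaN inputs. A minimal sketch of running the test on the non-missing values only is shown below. The helper name `shapiro_dropna` and the toy series are assumptions; the numbers it prints are not results from the Fiverr dataset.

```python
import numpy as np
import pandas as pd
import scipy.stats as stats


def shapiro_dropna(series):
    """Shapiro-Wilk on the non-missing values of a pandas Series."""
    clean = series.dropna()
    W, p = stats.shapiro(clean)
    return W, p, len(clean)


# Toy column with missing values (not from the Fiverr dataset).
rng = np.random.default_rng(1)
values = rng.exponential(scale=2.0, size=300)
values[::10] = np.nan  # inject missing values at every 10th position
toy = pd.Series(values, name="Avg Response Time")

W, p, n_used = shapiro_dropna(toy)
print(f"Shapiro-Wilk on {n_used} non-missing values: W={W}, p-value={p}")
```

Reporting the number of values actually used also makes it clear when the N > 5000 warning applies.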
In [ ]:
stats.probplot(df['Avg Response Time'], plot=plt)
plt.show()
stats.probplot(df['Rating Count'], plot=plt)
plt.show()
Image
Image
In [ ]:
df_copy = df.copy()

df_copy.dropna(subset=['Avg Response Time', 'Rating Count'], inplace=True)

x = df_copy['Avg Response Time']
y = df_copy['Rating Count']

pearson_coef, p_value = stats.pearsonr(x, y)
print("Pearson's correlation coefficient:", pearson_coef)
Pearson's correlation coefficient: -0.03211739739230705
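
The cell above computes `p_value` but prints only the coefficient, and Pearson's coefficient measures linear association only. Because the Q-Q plots show that both variables are far from normal, it may be useful to report the rank-based Spearman coefficient alongside it, together with both p-values. The sketch below shows one way to do this; the helper name `correlation_report` and the toy data are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import scipy.stats as stats


def correlation_report(df, col_x, col_y):
    """Pearson and Spearman coefficients with p-values on complete rows."""
    clean = df[[col_x, col_y]].dropna()
    pearson_r, pearson_p = stats.pearsonr(clean[col_x], clean[col_y])
    spearman_r, spearman_p = stats.spearmanr(clean[col_x], clean[col_y])
    return {
        "n": len(clean),
        "pearson": (pearson_r, pearson_p),
        "spearman": (spearman_r, spearman_p),
    }


# Toy data (not from the Fiverr dataset): a monotonic but non-linear
# relation, with one missing value that must be dropped first.
toy = pd.DataFrame({
    "Avg Response Time": [1, 2, 3, 4, 5, 6, np.nan, 8],
    "Rating Count": [1, 4, 9, 16, 25, 36, 49, 64],
})
report = correlation_report(toy, "Avg Response Time", "Rating Count")
print(report)
```

In the toy data the relation is perfectly monotonic, so Spearman's coefficient is 1 while Pearson's stays below 1; on the notebook's data both coefficients may well remain close to zero.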
In [ ]:
stats.probplot(df['Order in Queue'], plot=plt)
plt.show()
Image
In [ ]:
df_copy = df.copy()

df_copy.dropna(subset=['Avg Response Time', 'Order in Queue'], inplace=True)

x = df_copy['Avg Response Time']
y = df_copy['Order in Queue']

spearman_coef, p_value = stats.spearmanr(x, y)
print("Spearman's correlation coefficient:", spearman_coef)
Spearman's correlation coefficient: -0.053936931694605186
In [ ]:
stats.probplot(df['Rating'], plot=plt)
plt.show()
Image
In [ ]:
df_copy = df.copy()

df_copy.dropna(subset=['Avg Response Time', 'Rating'], inplace=True)

x = df_copy['Avg Response Time']
y = df_copy['Rating']

spearman_coef, p_value = stats.spearmanr(x, y)
print("Spearman's correlation coefficient:", spearman_coef)
Spearman's correlation coefficient: 0.09825559680015179
In [ ]:
plot_average_prices(df_multi_plan, "Category", ["Basic Price", "Standard Price", "Premium Price"],5)
Image
In [ ]:
plot_average_prices(df_multi_plan, ["Category","Field"], ["Basic Price", "Standard Price", "Premium Price"],10)
Image
In [ ]:
def calculate_outlier_percentage(series):
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    outliers_count = ((series < lower_bound) | (series > upper_bound)).sum()
    total_count = len(series)
    outliers_percentage = (outliers_count / total_count) * 100
    return outliers_percentage
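
A small worked example of the 1.5 × IQR rule used above may help when reading the percentages. The helper `iqr_bounds` and the toy prices are assumptions for illustration; the helper returns the two fences instead of the percentage so the intermediate values are visible.

```python
import pandas as pd


def iqr_bounds(series, k=1.5):
    """Return the (lower, upper) fences of the k * IQR rule."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Toy prices (not from the Fiverr dataset): one very expensive gig.
prices = pd.Series([1, 2, 3, 4, 5, 100])
lower, upper = iqr_bounds(prices)           # Q1 = 2.25, Q3 = 4.75, IQR = 2.5
is_outlier = (prices < lower) | (prices > upper)
outlier_pct = is_outlier.mean() * 100
print(lower, upper)           # -1.5 8.5
print(round(outlier_pct, 2))  # 16.67
```

Only the value 100 falls outside the fences, so one of six observations (about 16.67%) is flagged.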
In [ ]:
def plot_boxplot(df, x, y):
    plt.figure(figsize=(15, 6))
    sns.boxplot(y=y, x=x, data=df, orient="h")
    plt.title(f"{x} By {y} Box plot")
    plt.ylabel(y)
    plt.xlabel(x)
    plt.show()

    outliers_percentage = df.groupby(y)[x].apply(calculate_outlier_percentage)
    return outliers_percentage
In [ ]:
print(plot_boxplot(df_multi_plan,"Basic Price","Category"))
print(plot_boxplot(df_multi_plan,"Standard Price","Category"))
print(plot_boxplot(df_multi_plan,"Premium Price","Category"))
Image
Category
Business                  6.880734
Data                     12.641509
Digital Marketing         7.344633
Graphics & Design        11.485149
Lifestyle                 7.344633
Music & Audio             5.769231
Photography               7.228916
Programming & Tech       13.643178
Video & Animation        11.157025
Writing & Translation     9.883721
Name: Basic Price, dtype: float64
Image
Category
Business                 10.091743
Data                     11.698113
Digital Marketing         8.757062
Graphics & Design        13.861386
Lifestyle                 8.662900
Music & Audio             6.213018
Photography               8.132530
Programming & Tech       13.943028
Video & Animation        11.983471
Writing & Translation    10.465116
Name: Standard Price, dtype: float64
Image
Category
Business                 12.844037
Data                     10.943396
Digital Marketing         7.909605
Graphics & Design        13.267327
Lifestyle                 7.344633
Music & Audio             6.656805
Photography               9.638554
Programming & Tech       15.292354
Video & Animation        11.983471
Writing & Translation     9.302326
Name: Premium Price, dtype: float64
In [ ]:
def cat_plot(f1, f2, df, target):
    # catplot is figure-level and creates its own figure, so a separate
    # plt.figure() call would only leave an empty figure behind.
    sns.catplot(x=f1, y=target, hue=f2, data=df, kind="bar", height=6, aspect=2)
    plt.title(f"{target} Distribution by {f1} and {f2}")
    plt.xticks(rotation=45)
    plt.show()
In [ ]:
cat_plot('Seller Level', 'Category', df_multi_plan, 'Basic Price')
Image
In [ ]:
cat_plot("Seller Level", "Category", df_multi_plan, "Standard Price")
Image
In [ ]:
cat_plot("Seller Level", "Category", df_multi_plan, "Premium Price")
Image
In [ ]:
cat_plot("Category", "Seller Level", df_multi_plan, "Premium Price")
Image
In [ ]:
import math
def cat_plot_by_category(df, field_col, target_col, plots_per_row=3):
    categories = df["Category"].unique()
    num_categories = len(categories)
    num_rows = math.ceil(num_categories / plots_per_row)

    fig, axes = plt.subplots(num_rows, plots_per_row, figsize=(8 * plots_per_row, 6 * num_rows))
    axes = axes.flatten()

    for i, category in enumerate(categories):
        ax = axes[i]
        cat_df = df[df["Category"] == category]
        sns.barplot(
            x=field_col, y=target_col, hue=field_col, data=cat_df, ax=ax, dodge=False
        )
        ax.set_title(f"{target_col} Distribution for {category} and Field")
        ax.set_ylabel(target_col)
        ax.set_xlabel("")
        ax.set_xticklabels([])
        ax.legend(title=field_col)

    for j in range(i + 1, len(axes)):
        fig.delaxes(axes[j])

    plt.tight_layout()
    plt.show()
In [ ]:
cat_plot_by_category(df_multi_plan, "Field", "Basic Price")
cat_plot_by_category(df_multi_plan, "Field", "Standard Price")
cat_plot_by_category(df_multi_plan, "Field", "Premium Price")
Image
Image
Image